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NUMERICAL  METHODS  IN  CONTINUUM  MECHANICS 


FINAL  TECHNICAL  REPORT 

1.  STATEMENT  OF  THE  PROBLEMS  STUDIED 

Five  different  problems  are  studied  in  this  project.  They  are  stated 
briefly  below: 

(a)  HETEROGENEOUS  DIFFUSION  AND  THE  COMPACT  VOLUME  METHODS 

In  this  paper  we  study  heterogeneous  diffusion  in  the  context  of 

heat  conduction  in  laminated  material,  heat  conduction  with  phase  change 
(the  Stefan  problem)  and  fluid  flow  in  porous  media  (Richards' 

equation).  We  discretize  these  problems  via  the  compact  volume  method 
(CVM).  We  illustrate  that  this  method  is  higher  order  accurate  in  the 
flux  error  than  traditional  methods.  The  resulting  nonlinear  algebraic 
systems  are  approximated  by  different  versions  of  the  SOR  and  ADI 

schemes.  Vector  and  multiprocessing  implementations  were  presented, 
which  indicated  a  suitability  of  these  methods  for  high  performance 
computing. 

(b)  INDUCED  CONTRACTIVE  METHODS  FOR  FREE  BVPs 

In  this  paper  we  consider  numerical  schemes  for  the  solution  of 

nonlinear  algebraic  systems  which  evolve  from  the  discretization  of  the 
Stefan  problem,  and  from  the  fluid  flow  in  a  porous  media  problem, 
Richards'  equation.  We  presented  analysis  of  induced  contractive  methods 
for  free  boundary  value  problems.  Both  the  ADI  and  SOR  versions 
vectorize  very  well.  In  3D  problems  with  complicated  nonlinear  terms  we 
expect  the  ADI  version  to  be  more  robust  and  to  perform  equally  as  well 
as  with  the  SOR  version. 

(C)  NUMERICAL  SOLUTION  OF  FLUID  FLOW  IN  PARTIALLY  SATURATED 
POROUS  MEDIA 

This  paper  describes  an  SOR  algorithm  for  solving  the  nonlinear 
algebraic  system  which  evolves  from  Richards'  equation  that  models  fluid 
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flow  in  a  porous  media.  The  moisture  content  and  hydraulic  conductivity 
functions  are  approximated  by  piecewise  linear  functions  obtained  from 
field  data.  The  resulting  algebraic  system  is  solved  by  a  variation  of  the 
nonlinear  SOR  algorithm.  The  advantage  of  this  approach  is  that  it  avoids 
some  of  the  numerical  oscillations  associated  with  large  derivatives  in 
the  data.  Numerical  calculations  are  presented  and  illustrate  the 
following:  (I)  agreement  of  the  numerical  model  with  observed  data, 

(ii)  dependence  and  comparison  results  as  a  function  of  uncertain  data, 
and  (iii)  suitability  of  these  algorithms  for  multiprocessing 
computations  via  domain  decomposition  methods.  Extension  of  these 
algorithms  to  heterogeneous  porous  media  fluid  flow  are  discussed. 

(d)  A  COMPACT  FINITE  VOLUME  SCHEME  FOR  2-D  STEFAN  PROBLEMS  AND 
VECTOR/MULTIPROCESSOR  COMPUTERS 

We  consider  both  the  compact  finite  volume  and  finite  difference 
space  discretizations  of  the  Stefan  problem.  The  resulting  algebraic 
systems  are  solved  by  nonlinear  versions  of  ADI  and  SOR.  Both  algorithms 
contain  significant  parallelism  which  is  demonstrated  on  two 
vector/multiprocessing  computers,  the  Alliant  FX/40  and  the  Cray  Y-MP. 
Numerical  experiments  indicate  that  the  compact  discretization  and  ADI 
give  the  best  accuracy  with  the  minimum  computational  cost. 

(e)  A  COMPARATIVE  STUDY  OF  COMPACT  FINITE  VOLUME  METHODS  FOR 
THE  2-D  AND  3-D  DIFFUSION  EQUATIONS  WITH  FINITE  DIFFERENCE  ADI 
AND  SOR 

Recently  developed  compact  finite  difference  scheme  (CPT)  were 
applied  to  two  and  three  dimensional  diffusion  equations.  The  relative 
merits  of  CPT-ADI  were  investigated  with  other  computational  schemes 
such  as  finite  difference-ADI  and  FDM-SOR.  The  numerical  results 
obtained  from  these  three  approaches  are  compared  to  known  analytical 
solutions.  According  to  our  results  CPT-ADI  was  found  to  be  a  superior 
scheme  with  regard  to  accuracy  and  speed. 
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2.  SUMMARY  OF  THE  MOST  IMPORTANT  RESULTS:  THREE  MOST 
IMPORTANT  RESULTS  ARE  GIVEN  BELOW: 

(i)  We  have  given  an  analysis  of  induced  contractive  methods  for  free 
boundary  value  problems.  The  SOR  version  and  the  ADi  version  of 
Algorithm  seemed  to  perform  equally  well.  However,  the  SOR  version  was 
very  sensitive  to  the  choice  of  SOR  parameter,  which  contrasts  with  the 
ADi  method  which  was  observed  to  be  fairly  robust  with  respect  to  its 
acceleration  parameters. 

Both  AOI  and  SOR  versions  vectorize  very  well,  and  as  expected,  in 
3D  problems  with  complicated  nonlinear  terms  the  ADI  version  was  found 
to  be  more  robust  but  perform  equally  as  well  as  with  the  SOR  version. 

(ii)  We  have  shown  for  a  number  of  heterogeneous  diffusion  problems 
that  compact  volume  discretization  method  gives  higher  order  error 
estimates  for  both  function  and  flux  errors.  Moreover,  we  have 
illustrated  that  both  the  ADI  and  SOR  algorithms  can  be  adapted  to  solve 
the  resulting  nonlinear  algebraic  syatems.  Vectorization  and 
multiprocessing  computers  were  shown  to  be  effective  in  executing  these 
algorithms. 

In  the  applications  to  the  heat  conduction  in  a  laminate  materials, 
the  Stefan  problem,  and  Richard's  equation,  we  have  indicated  how  one  can 
extend  the  compact  volume  method  to  2D  and  3D  space  problems.  Ail  these 
problems  are  of  current  interest  to  physical  scientists  and  engineers. 
Moreover,  the  mathematicai  analysis  of  convergence  problem  using  the 
more  complicated  versions  of  CVM  is  currently  needed. 

(iii)  The  Richard's  equation  is  approximated  by  the  finite  difference 
method,  and  the  empirical  data  for  the  moisture  content  and  the  hydraulic 
conductivity  were  approximated  by  pieecewise  linear  functions.  The 
resulting  non-linear  algebraic  system  is  solved  by  a  variation  of  non¬ 
linear  SOR  iterative  method. 

Good  convergence  properties  are  observed  for  three  types  of 
calculations  which  were  chosen  to  demonstrate  the  feasibility  of  realistic 
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numerical  simulations  using  the  SOR  iterative  methods.  These  includes  an 
accurate  simulation  of  fluid  flow  in  Brindabella  Loam,  a  sensitivity 
analysis  of  the  computed  solution  upon  the  empirical  data,  and  the  use  of 
multiprocessing  computers  via  domain  decomposition  methods. 

We  expect  the  methods  of  this  paper  to  generalize  via  the  compact 
volume  method  to  the  more  complicated  heterogeneous  case. 

3.  LIST  OF  PUBLICATIONS  AND  TECHNICAL  REPORTS: 

The  following  scientific  articles  are  either  published  or  sent  for 
publication: 
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White,  B.  N.  Borah,  A.  J.  Kyriilidis;  sent  for  publication  to  the  Journal 
of  Scientific  Computing. 

(U)  Heterogeneous  Diffusion  and  the  Compact  Volume  Method,  R.  E.  White, 
B.  N.  Borah,  A.  J.  Kyriilidis;  sent  for  publication  to  Journal  of  Scientific 
Computing. 

(iii)  Numerical  Solution  of  Fluid  Flow  in  Partially  Saturated  Porous 
Media,  A.  J.  Silva  Neto,  R.  E.  White;  sent  for  publication  to  Journal  of 
Mathematical  Computing. 
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5-8. 


(v)  A  Comparative  Study  of  Compact  Finite  Volume  Methods  for  the  2-D 
Diffusion  Equations  with  Finite  Difference  ADI  and  SOR,  B.  N.  Borah,  R.  E. 
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ATTACHMENT  #1  induced  Contractive  Methods  for  the  Free 
Boundary  Vaiue  Probiems 


Induced  Contractive  Methods  for  Free  BVPs^ 


b  y 

R.  E.  Whitel,  B.  N.  Borah^,  A.  J.  Kyrillidisl 


Abstract.  In  this  paper  we  consider  numerical  schemes  for  the  solution  of 
nonlinear  algebraic  systems  which  evolve  from  the  discretization  of  the  Stefan 
problem,  and  from  the  fluid  flow  in  a  porous  media  problem,  Richards'  equation. 
Both  problems  generate  systems  of  the  form  E  -i-  At  A  P(E)  =  d  where  P(E)  is  continuous 
and  nondecreasing.  In  the  Stefan  problem  A  will  be  a  constant  symmetric  matrix. 
For  Richards'  equation  A  will,  in  general,  be  nonconstant  and  nonsymmctric. 
However,  in  both  cases  A  is  an  M-matrix.  We  consider  nonlinear  SOR  and  nonlinear 
ADI  schemes  which  may  be  implemented  on  vector/multiprocessing  computers. 


Subject  Classifications.  Primary  65H10:  Secondary  76S0S.  76T0S. 

^  Department  of  Mathematics.  North  Carolina  State  University 
^  Department  of  Mathematics.  North  Carolina  A&T  State  University 
^  Sponsored  by  the  Army  Research  Office.  RTP,  NC,  Contract  No.  DAAL03-90-G-0126 


§1.  Introduction.  The  primary  objective  of  this  paper  is  ilic 
development  of  algorithms  which  are  applicable  to  the  Stefan  problem  and  to 
Richards'  equation  for  fluid  flow  in  a  porous  media,  see  Richards  [6],  Freeze  and 
Cherry  [1],  and  Paniconi  et  al.  [S].  Both  these  problems  have  similar  formulations, 
and  therefore,  some  of  the  algorithms  which  have  been  used  for  the  Stefan  problem 
may  be  applicable  to  the  more  complicated  Richards'  equation.  In  Silva  Neto  and 
White  [7]  and  [8]  a  local  nonlinear  SOR  algorithm  was  used  for  the  numerical  solution 
of  Richards'  equation.  In  the  present  paper  we  describe  methods  which  evolve  from 
the  contractive  algorithm  in  R.  E.  White  [11].  and  this  is  reviewed  in  the  second 
section  of  this  paper.  Here  we  describe  nonlinear  SOR  and  nonlinear  ADI  iterative 
methods  which  can  be  used  for  two  and  three  space  dimension  problems.  We  will 
emphasize  the  implementation  on  vector/multiprocessing  computers. 

In  the  contractive  algorithm  the  linear  solve  step  can  be  approximated  by  an 
iterative  method,  and  an  induced  contractive  scheme  is  formed.  This  results  in  a 
nonlinear  nested  iteration  which  is  described  in  section  three.  In  sections  four  and 
five  we  study,  as  special  induced  contractive  methods,  the  nonlinear  SOR  and  ADI 
schemes.  Section  six  contains  numerical  illustrations  for  the  Stefan  problem.  In 
section  seven  the  similarities  of  the  Stefan  problem  and  of  Richard's  equation  are 
discussed.  Computations  are  done  which  illustrate  that  this  method  gives  good 
comparisons  with  the  computed  and  with  the  observed  moisture  data  for  Brindabella 
loam  [10].  The  last  section  contains  the  conclusions  and  recommendations. 

§2.  The  Contractive  Algorithm.  In  this  paper  we  consider 
nonlinear  problems  of  the  form 

£  +  AtA/3(£)  =  </  (1) 

where 

d.Ee 

A  e  91NxN 

P(E)  =  [Pi(Ei)]  e  91N, 

Pi:9l  -^91  i  =  1 . N. 

For  the  Stefan  problem  E  is  the  enthalpy  and  P(E)  reflects  the  temperature.  The 
horizontal  part  of  Figure  1  corresponds  to  the  latent  heat. 


E 


Figure  1.  Pi(E) 

Problems  of  the  above  form  evolve  from  the  implicit  time  discretization  of 
nonlinear  parabolic  equations.  The  matrix  A  is  from  the  elliptic  part  and  is  usually 
an  M-matrix.  see  [4],  or  a  symmetric  positive  definite  (SPD)  matrix.  Here  we 
emphasize  the  M-matrix  case  where  A  may  not  be  symmetric.  The  SPD  matrix  case 
can  be  studied  in  the  context  of  monotone  mappings,  see  sections  6.4  and  12.1  in  [4]. 

The  function  P(E)  =  [PjCE^)]  €  91^  where  E|,  E|  >  0  has  the  following  properties; 


(Ej  -  E  i)  c j  2  Pi(Ei)  -  Pi(E  j). 

o 

V 

o 

• 

(2.1) 

Pi(Ei)  "  Pi(E  j)  2  C2  (E|  -  E  j). 

C2^0, 

(2.2) 

Pi(Ei)  ^  • 

C3>0. 

(2.3) 

The  standard  example  is  from  the  Stefan  problem,  but  many  other  physical  problems 
can  be  put  into  the  above  form,  see  [2].  Later  we  will  emphasize  the  case  evolving 
from  fluid  flow  in  porous  media,  see  [1]. 

In  [11]  an  alternate  problem  was  proposed  so  that  (1)  could  be  solved  via 
successive  approximations.  It  has  the  form 

E  -t-  dt  A(E)  E  =  d 

A(E)  =  [ay  (Pj(Ej)  /  Ej)]  and 
A  =  [ajj]. 


where 


(3) 
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The  following  algomhrn  was  studied  in  111).  There  the  ajj  where  functions  of  E.  but 
for  the  present  we  assume  they  are  constants. 

Algorithm  1.  Contractive  iteration.  Consider  (3).  The  successive 
approximation  is 


Em+1  =  (I  +  At  A(E«n))-l  d  -  G(En>). 

If  At  is  small  enough,  then  p(At  A(E))  <  5  <  1  for  all  E  >  0.  and  consequently, 
one  can  write  (I  At  A(E))'^  in  the  form  of  a  geometric  series.  This  fact  and  some 
technical  arguments  were  used  to  show  that  G  is  a  contraction  on  some  box  in  9t^.  In 
practice  the  constraint  on  At  does  not  seem  to  be  severe.  The  following  is  a  special 
case  of  the  theorem  in  R.  E.  White  [II]. 

Theorem  1.  Consider  problem  (3)  and  Algorithm  1.  If  A  is  an  M-matrix  in  (1) 
and  each  3}  satisfles  (2.1)  •  (2.3).  then  for  suitably  small  At.  G(E)  is  a  contraction  on 
some  box  [cq.  2lldlloe]^. 

In  two  and  three  space  dimensions,  the  solve  step  will  most  likely  be  done  by 
an  iterative  algorithm.  Later  we  shall  focus  on  the  SOR  and  ADI  algorithms. 


§3.  Induced  Contractive  Method.  Let  S  c  9t  ^  and  G:S->S  be 
contractive  with  0  <  c  <  1  and 


|C(£)-C(£)L  Sci^-SL- 

Suppose  G  is  as  in  Theorem  1  where 


S  £  [eo.  2  lidlU]N  and 
G(E)  s  a  +  At  A(E))-1  d. 
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Algorithm  2.  Induced  contractive  iteration.  Lei  1  +  At  A(E)  =  B(E)  -  C(E) 
be  a  splitting  that  defines  a  new  map  G:S  — >  S 

G(E)sE'‘ 

where 

eO=e. 

=  B(E)-1  C(E)  +  B(E)*1  d  and 
I  ^  k  ^  K. 

Theorem  2.  Let  the  assumptions  of  Theorem  1  bold.  If  IIB(E)*^  C(E)lleo  ^8<1 

for  all  E  e  S,  then  there  exists  a  constant  K  >  0  such  that  E"^***!  s  G(E™),  where  E^  c  S 

A 

and  G  is  given  by  Algorithm  2.  converges  to  the  solution  of  £$  -  G(Es)  e  S. 

Proof.  Ut  H  *  B(E)-1  C(E)  and  H  =  B(Er*C(E). 

G(E)  =HEK-l+B-id 

=  hKe+[I  +  ...  +  hK*1jb-1  d 
=  hk  E  +  a  ’  hK)  a  -  h)*i  b-i  d 

=  hK  E  +  (I  -  hK)  (B  (I  -  H))'l  d 
=  hK  E  +  (I  -  hK)  G(E) 

=  hK  (E  -  G(E))  +  G(E) 

If  E  =  Es  =  G(Es),  then  G(E,)  =  E,. 

G(E)  - G(E)  =  hK  (E  -  G(E))  +  G(E)  -  h'^(E -G(E)) - G(E) 

=  hK  (E  -  G(E))  -  h'^(E  -G(E))  +  G(E)  + 

h'^(E-G(E))  -  h'^(E-G(E))  -  G(E) 

=  h'^(E -  E)  -  H’^(G(E) - G(E))  -KKE)  -  G(E)  +  - h'^)(E  - G(E)) 

Ut  E  =  Em  with  =G(E'"),  and  E  =EssG(Es). 

Em+l.Es  =G(Em).G(Es) 

=  HK  (Em  -  Es)  -  HK  (G(Em)  -  CKEg))  +  G(Em)  -  G(Es) 

By  the  assumptions  on  H  and  G,  we  have 

IIEm+1  .  Esiloo  ^  (SK  +  8^  c  +  c)  IIEm  .  Eglloo. 

Since  8,  c  <  1,  we  may  choose  K  so  that  8^  +  gK  ^  +  c  S  (1  +  c)  /  2  =  5  <  1. 

Thus  IIEm+1  .  Eslloo  ^  6  IIEm  -  EsIU  ^  5™'^*  IIEP  -  Esiloo. 


Corollary.  Suppose  p(B(E)*^  C(E))  <  1  for  all  E  e  S.  Then  E"^  converges  to  Es  provided 
K  =  K(m)  is  suitably  large. 

Proof.  Since  p(B(E)*^  C(E))  <  1. 

H(E)K  =  (B(E)-l  C(E))K  0  as  K  oo. 

IIEin+l  .  Esiloo  ^  (IIH(E«n)K|loo  +  IIH(E®)K||oo  c  +  c)  IIE®  -  Ejlloo. 

For  large  enough  K  =  K(m)  we  have 

IIEm+l  .  Esiloo  S  5  IIE“»  -  Esiloo  ^  5®+^  lE^  -  EgHoo.  where  8  =  (1  +  c)  /  2. 


§4.  Nonlinear  SOR.  In  this  section  we  consider  the  SOR  version  of  the 
splitting  in  the  induced  contractive  method  in  Algorithm  2.  If  0  <  cd  ^  1.  then 
Algorithm  2  converges  to  the  solution  of  (I). 

Theorem  3.  Let  assumptions  of  Theorem  1  hold.  Consider  the  SOR  splitting 

1  1-CD 

I  +  At  A(E)  =  (—  a  +  At  D(E))  -  At  L(E))  -  ( - 0  +  At  D(E))  +  At  U(E)) 

(0  OD 

Moreover,  if  At  is  further  restricted  so  that  I  -i-  At  A(E)  is  uniformly  strictly  diagonally 
dominant,  then  we  may  choose  K  independent  of  m. 

Proof.  I  +  At  A(E)  is  an  M-matrix,  and  this  splitting  is  a  weak  regular 

splitting  for  all  E  e  S.  Thus  p(H(E))  <  1,  and  by  the  above  Corollary  E™  must  converge 
to  Es  provided  K  =  K(m)  is  suitably  large. 

If  we  restrict  At  so  that  1  +  At  A(E)  is,  uniformly  with  respect  to  E  e  S,  strictly 
diagonally  dominant,  then  we  can  show  that  the  assumptions  of  Theorem  2  hold.  This 
is  done  by  standard  method  and  is,  for  o  =  1, 

max—'  —  — Bit - ^  8  <  1. 

•  l  +  Arai(£)-A/2^a^(£) 
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§5.  Nonlinear  ADI.  In  this  section  we  consider  the  ADI  version  of  the 
induced  contractive  method  in  Algorithm  2.  Here  the  analysis  is  based  on  M- 
matrices,  as  contrasted  with  SPD  matrices  in  Ortega  and  Rhcinboldt  [4].  Later  this  will 
be  useful  as  the  analysis  does  not  require  symmetric  matrices  for  the  application  to 
Richards'  equation. 

Let  A  =  H  +  V  and  A(E)  =  H(E)  +  V(E)  where  H  and  V  represent  the  grid  rows  and 
grid  columns,  respectively.  H  =  (1/2)  I  +  At  H(E)  and  V  =  (1/2)  I  +  At  V(E)  so  that 
A(E)=  H  +  V. 

Theorem  4.  Let  the  assumptions  of  Theorem  1  hold.  Consider  the  ADI  splitting 

I  +  At  A(E)  =  B(E)  -  C(E) 

where 

B(E)-^  =(al+  V)'l  (I  +  (aI-  H)(al+  H)-*]  and 
C(E)  =  A(E)  -  B(E). 

There  exists  a  constant  ao  such  that  if  a  ^  ao  ^  0,  then  Algorithm  2  converges  to  the 
solution  of  (1). 

Proof.  We  shall  show  that  the  splitting  is  a  weak  regular  splitting  of  the 
M-matrix  I  -t-  At  A(E). 

B(E)*^  =  (a  I  +  V)*l  [I  +  (o  I  -  H)  (a  I  +  H)"!] 

=  (al+  V)-l  ((al+  H)  +  (aI-  H)l(oI+  H)- ^ 

=  (al+  V)-l  (2al)  (oI+  H)-l. 

By  the  conditions  on  P(E)  there  exists  a\  0  such  that  for  a  ^  ai,  a  I  +  V  and  a  I  -f  H 
are  M-matrices.  Thus  B(E)'l  ^  0.  Moreover,  there  exists  a2  ^  0  such  that  for  a  ^a2. 

A  A 

aI*H,  al-V^O.  Thus  for  a  ^  oq  =  max  (ai,  02) 

B(E)-1  C(E)  =  (a  I  +  V)-l  (a  I  -  H)  (a  I  +  H)-l  (o  I  -  V)  S  0. 

By  the  Corollary  to  Theorem  2  Algorithm  2  converges  to  the  solution  of  (1). 


§6.  Numerical  Experiments  for  the  Stefan  Problem.  In  this  section 
we  consider  the  numerical  solution  of  a  one-phase  Stefan  problem  in  two  space 
dimensions  where  the  domain  D  =  (0.1)  x  (0,1)  x  (0,T),  T  =  1/V2.  The  exact  solution  of 


where 


El  -  3(E)xx  *  3(E)yy  =  0 


is 


P(E)  =  <0. 

Ie, 


E<-1 
-1  <E<0 
E>0 


t  -  (1/^/1)  (x  +  y)^0 
\-l.  t-(l/%^)(x  +  y)<0’ 


The  initial  condition  on  E  and  the  Dirichlet  boundary  condition  on  3(E)  are 
implicitly  given  in  the  above  formula  for  E.  The  space  variables  were  discretized  by 
the  traditional  five  point  finite  difference  method  resulting  in  the  nonlinear 
algebraic  system  (1).  We  compared  the  performance  of  the  SOR  version  of  the 
induced  contractive  algorithm,  denoted  simply  as  FDM-SOR,  with  the  ADI  version  of 
the  induced  contractive  algorithm,  denoted  as  FDM-ADI.  The  computations  were  on 
the  vector/multiprocessing  computer  Cray  Y-MP.  The  Cray  was  used  as  a  single 
processor. 

The  parameters  set  for  this  algorithm  specified  the  number  of  nodes  in  each 
direction,  the  acceleration  variables  and  the  error  tolerance.  For  our  experiments, 
the  number  of  nodes  was  the  same  for  each  space  and  time  direction.  Thus,  the  time 
step  size  is  approximately  forty  times  the  maximum  allowable  time  step  size  for  this 
test  problem  and  for  the  explicit  method. 

Both  the  FDM-SOR  and  the  FDM-ADI  methods  showed  the  same  pattern  with 
respect  to  the  number  of  iterations  for  convergence.  The  tolerance  values  were  set 
at  ein  =  lO'^.Cout  =  10'^.  (or  the  inner  and  outer  iterations,  respectively.  The  number 
of  outer  iterations  was  three,  in  all  reflnements  considered  of  a  basic  256  cell  (16  x 
16)  spatial  discretization,  which  we  will  subsequently  refer  to  as  the  basic  problem. 
The  number  of  inner  iterations  can  be  considerably  decreased  by  the  correct  choice 
of  the  acceleration  parameter.  For  the  SOR  version  of  Algorithm  2  we  used  co  =  1.5, 
1.6,  1.7,  and  1.8  for  the  number  of  cells  =  256,  1024,  4096,  and  16,384,  respectively.  For 
the  ADI  version  of  Algorithm  2  we  used  a  »  4.0,  6.0,  8.0,  and  10.0  for  the  number  of  cells 
=  256,  1024,  4096,  and  16,384,  respectively. 
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Our  intent  was  not  to  determine  an  optimal  value  for  the  acceleration 
parameter  but  rather  to  determine  an  appropriate  value  at  which  the  inner 
iterations  are  minimal  for  our  test  problem.  Our  numerical  experiments  showed  that 
the  appropriate  value  of  o)  varied  between  l.S  and  1.8  for  all  refinements  ol  the  basic 
problem.  Our  numerical  experiments  showed  that  the  value  of  the  acceleration 
parameter  a  increases  considerably  with  successive  refinements  of  the  basic 
problem.  Experiments  showed  that  a  can  be  chosen  without  much  consideration, 
which  is  an  advantage  of  the  FDM-ADI  version  of  Algorithm  2. 

Table  1.  CPU  time  (secs)  on  Cray-YMP, 

Algorithm  2/SOR 


als  2  \  cells 

256 

1024 

4096 

serial 

D.19 

vector 

0iO6 

mi 

nHUi 

In  our  experiments  we  used  two  versions,  serial  and  vector,  of  the  codes  that 
ran  on  the  vector/multiprocessing  computers.  The  purpose  here  was  to  determine 
the  degree  of  vectorizability  for  each  method.  Tables  1  and  2  show  the  CPU  times  for 
both  algorithms  to  solve  each  of  the  successive  refinements  of  our  basic  mesh.  In 
the  case  of  FDM-SOR  method  the  traditional  red-black  ordering  of  the  nodes  allows 
effective  vectorization.  In  the  case  of  FDM-ADI  method  we  implemented  the  vector 
version  of  the  basic  tridiagonal  solver. 


Table  2.  CPU  time  (secs)  on  Cray-YMP, 
Algorithm  2/ADI 


Tables  I  and  2  illustrate  that  the  FDM-SOR  method  exhibits  higher  speed-ups  as 
compared  to  those  of  the  FDM-ADI  method.  The  CPU  lime  of  the  vector  FDM-SOR 
method  was  the  smallest.  Some  of  the  larger  CPU  time  for  the  FDM-ADI  method  is 
attributed,  in  this  case,  to  the  computation  and  storing  of  the  components  of  A(E). 

Next  we  considered  a  version  of  the  ADI  method  that  uses  variable  a.  The 
calculations  of  the  variable  a  were  done  using  the  scheme  in  [9]  where  a  =  At  and  b  = 
1/Ax^.  Although  this  scheme  was  designed  for  a  particular  boundary  value  problem, 
it  did  work  well  for  our  nonlinear  problem  and  was  fairly  robust.  Table  3  shows  the 
CPU  times  for  the  vectorized  SOR.  ADI  with  constant  a.  and  ADI  with  variable  a.  As  the 
number  of  unknowns  increases,  the  merits  of  the  ADI  method  with  variable  a  become 
clear. 


Table  3.  CPU  times  for  Algorithm  2 


algNcells 

256 

1024 

4096 

16  384 

SOR 

ADI 

ADI  var  a 

$7.  Fluid  Flow  in  a  Porous  Media:  Richards'  Equation.  Richards 
equation  for  fluid  flow  in  a  porous  media  was  first  formulated  in  1931,  see  [6].  It  is  a 
nonlinear  parabolic  equation  for  the  pressure  head  =  'P  where 

h  =  S'  +  2, 

and  h  is  called  the  hydraulic  head.  The  gravitational  direction  is  given  by  z.  Darcy's 
Law  is  used  to  form 


©(•P)!  -  V  K('P)  Vh  =  0  where 

K  =  hydraulic  conductivity  and  ©  =  moisture  content.  They  are  empirical  functions  of 
4*.  see  [1]  and  [3]  and  Figures  2  and  3 


Figure  2.  K  =  K('F)  =  hydraulic  conductivity 


Figure  3.  6  =  ©(S')  =  moisture  content 

In  references  [1]  and  [3]  it  is  noted  that  the  above  curves  may  differ  according 
to  "wetting"  or  "drying",  and  the  slopes  can  be  large.  The  functions  can  be 
approximated  by  rational  polynomials  of  order  4  or  5.  This  is  done  so  that  traditional 
analytic  methods  can  be  used;  however,  this  introduces  some  approximation  errors 
and  the  functions  are  expensive  to  compute. 


Richards'  Equation. 


1  2 


e('F)l  -  V  K('F)  VH*  =  K(4')z 


Robin  boundary  conditions  arc  typical  and  have  the  form  KC'P)—  given  when  n  is  a 

dn 


unit  outward  normal  to  the  surface. 


In  a  recent  paper  by  Paniconi  et  al.  [S]  several  numerical  methods  were 
described,  and  some  numerical  experiments  in  one  dimension  were  done.  In  the 
cases  were  the  slopes  of  6  and  K  were  large,  numerical  difHculties  were  encountered. 
The  moisture  function  6 (S')  is  analogous  to  the  enthalpy  function  E  which  is  a 
discontinuous  function  of  temperature  in  the  Stefan  problem.  This  suggests  that  one 
may  be  able  to  apply  the  following  moisture  formulation  of  Richards'  equation. 

Let  E  =  6(4*)  and  apply  the  Kirchhoff  transformation  to  the  hydraulic 
conductivity. 

S'  __ 

u  *  PCS')  s  Jk('F)c1¥. 

S'  =  pl(u). 

.  E  =  e(F-l(u)). 

P(E)  =  u  and 

y(E)  =  K(e-1(E)). 

Then  Richards'  equation  can  be  written  in  terms  of  E  where  E  is  the  primary 
unknown  and  represents  the  moisture. 

Moisture  Formulation  of  Richards'  Equation. 

El  -  A  p(E)  =  y(E)z 

Here  the  term  on  the  right  side  makes  this  a  little  more  complicated  than  the 
Stefan  problem.  There  are  essentially  two  cases:  either  the  derivative  of  y  is  not  too 
large,  or  the  derivative  is  large  or  even  a  jump  discontinuity  occurs. 


In  the  first  case  we  may  write  y(E)z  =  y'(E)  E^  where  0  <  y'(E)  <  M  <  oo.  a 
reasonable  implicit  time  discretization,  say  in  one  space  dimension,  is  for  Ei  from  the 
present  time  step  and  for  £,  from  the  previous  time  step 

m  Ax  Ax 

Here  the  coefficient  matrix  is  more  complicated  than  in  (1).  but  it  is  still  an  M-matrix 
and  the  more  general  contractive  algorithm  in  [11]  can  be  used. 

The  second  case  is  when  y'(E)  is  so  large  that  y(E)  appears  to  have  jump 
discontinuities.  This  is  common  once  the  space  variables  have  been  discretized.  A 
special  case  has  been  studied  by  A.  Friedman  [2]  where  existence  of  a  weak  solution  to 
the  continuum  problem  is  established.  There  for  H  =  the  Heavyside  function 

e(*p)  =  a  S'  +  H(S'  -  'Fo). 

K('F)  =  ki  +(k2-ki)H('F-'Fo). 

^0  =  0*  ^2  =  2  and  kj  =  1. 

Then  the  one  space  dimension  form  of  Richards’  equation  is 

(a  u  +  H(u))t  -  uxx  =  H(u)x. 

Let  E  s  a  u  -»■  H(u)  and  P(E)  =  u  so  that  E  =  a  p(E)  +  H(u),  H(u)  =  E  -  a  P(E)  and 

Et  -  Ex  -  P(E)xx  +  o  P(E)x  =  0. 

A  reasonable  discretization  has  the  form  of  (1)  or  (3).  Under  some  constraint  on  Ax 
A(E)  will  be  an  M*matrix,  and  the  theorem  in  [11]  will  be  applicable.  In  this  case. 
A(E)  is  tridiagonal,  and  hence,  there  is  no  need  to  use  an  induced  iterative  method. 
In  higher  space  dimensions  the  induced  iterative  methods  will  be  more  desirable. 


1  4 

The  above  approximation  of  the  empirical  functions  is  ver>  restrictive.  In  jTj 
and  [8]  more  accurate  approximations  are  used  so  that  diffusion  of  the  fluid  in  a 

partially  saturated  medium  can  be  calculated.  A  slightly  less  accurate  approximation 
which  will  track  a  wet-dry  interface  is  as  follows.  Here  we  have  set  l/o  =  0- 

\ajVf+02. 

and 

„  ^<0 
'W  ¥^0' 


In  our  calculations  we  shifted  the  horizontal  axis  to  the  right  so  that  6  and  K  have 
the  graphs  given  in  Figures  4  and  S.  We  used  the  K(0)  form  of  the  hydraulic 
conductivity  because  it  is  continuous. 


Figure  4.  6(u) 
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E  =  e, 


PiE)  = 


and 


K{E)^ 


a 


^£.  E<e. 
K 


a 


*1 

^(£-fl,)+fe,.  e,<E 

[kj  K 


«2-^. 

ikj,  02  <  £. 


For  the  one  space  variable  problem  we  have 

E,-0iE)„-KiE),=Q  ,0^xSU>0 
withboundary  conditions 
i3(£), +/:(£)  =  /?  ,JC  =  U>0 
^(£),  +  A’(£)  =  0  .ac*0,f>0 
and  initial  condition 
£  =  0,  ,0^x^Ut-0. 


The  discretization  for  the  interior  cells  is 


^^+-4(-P(£,.,)+2«£<)-P{E,.,))-i(Ar(£„, )-£(£, ))  =  0. 

A/  Ax  Ax 


The  matrix  in  (3)  is  formed  in  a  similar  way.  and  for  the  interior  cells 


1  .^PiEA  ^  K(EX  ^ 


’i+I 


■'i+1 


The  two  end  cells  incorporate  the  boundary  conditions  in  the  standard  way. 


In  the  following  calculations  we  used  the  parameters  that  reflect  Brindabclla 


loam: 

0,  =.l  1  =  residual  moisture 
6^  =.27  =  from  the  moisture  data 
62  =.485  =  saturated  moisture 
ik,=10*^ 
k2  =.022 

R  =  0.0165[m  /hr]  =  flow  from  top 


No  flow  from  sides  or  bottom 
Af  =.  1 25[ hr]  with  40  time  steps 
Ax  =.300  /  20[m]  with  20  grid  cells 
Q)  =  1.2  the  SOR  parameter 

=  10“*  absolute  error  for  inner  SOR 
=  10*^  absolute  error  for  outer. 


On  the  average  3  or  4  inner  iterations  and  5  to  8  outer  iterations  were  required 
for  convergence.  Figures  6a  and  6b  indicate  the  computed  and  observed  moistures. 
In  Figure  6b  the  diamonds,  triangles  and  squares  represent  the  observed  data  at  the 
times  1.0,  2.25  and  5.0  hours,  respectively.  Note  the  agreement  as  the  front 
progresses  from  the  top  (left)  to  the  bottom  (right).  Once  the  bottom  starts  to  fill  the 
hydraulic  conductivity  increases  and  the  diffusion  becomes  more  important.  In  [7] 
the  later  stages  of  this  are  modelled,  but  the  calculations  are  more  costly  than  in  the 
above  model. 


omoioomoinoiooiooiooiooioowo 

f-co'»«>r^o>oevic*jioh»o»0'*-eo'vior^ooo 

f-^'»-'*-T-'r-c>icvcvoioicveveo 


time>l.O 

tlme-2.25 

time-5.0 


depth 


Figure  6a.  Computed  Moisture 
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Figure  6b.  Observed  Moisture 


§8.  Conclusions.  We  have  given  an  analysis  of  induced  contractive 
methods  for  free  boundary  value  problems.  The  SOR  version  and  the  ADI  version  of 
Algorithm  2  seemed  to  perform  equally  well.  However,  the  SOR  version  was  very 
sensitive  to  the  choice  of  SOR  parameter,  and  this  contrasts  with  the  ADI  method 
which  was  observed  to  be  fairly  robust  with  respect  to  its  acceleration  parmater. 

Both  the  ADI  and  SOR  versions  vectorize  very  well.  In  3D  problems  with 
complicated  nonlinear  terms  we  expect  the  ADI  version  to  be  more  robust  and  to 
perform  equally  well  as  with  the  SOR  version. 

The  numerical  examples  were  given  for  the  two  space  variable  Stefan 
problem.  Application  of  these  methods  to  Richards'  equation  was  outlined.  For  a  one 
space  variable  flow  through  the  Brindabella  loam  we  showed  that  the  computed  and 
observed  moisture  were  in  good  agreement. 
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ATTACHMENT  #2  Heterogeneous  Diffusion  and  the  Compact 
Voiume  Method 


Heterogeneous  Diffusion 
and  the 

Compact  Volume  Method^ 


by 


R.  E.  Whited  B.  N.  Borah^,  A.  J.  Kyrillidis' 


Abstract.  In  this  paper  we  study  heterogeneous  diffusion  in  the  context  of  heat 
induction  in  laminated  material,  heat  ctmdocrion  with  phase  change  (die  Stefan  problem)  and  fluid 
flow  in  porous  media  (Richards'  equation).  We  discretize  these  problems  via  die  ctnnpact  volume 
method  (CVM).  We  illustrate  that  this  method  is  higher  order  accurate  in  the  flux  error  than 
traditional  methods.  The  resulting  nonlinear  algebraic  systems  are  iqiproximated  by  versions  of  the 
SOR  and  ADI  schemes.  Vector  and  multiprocessing  implementations  are  presented,  and  they 
indicate  that  these  methods  are  suitable  for  high  performance  conqniting. 


^  Department  of  Mathematics,  North  Carolina  State  University 
^  Department  of  Mathematics,  NorUi  Carolina  A&T  State  Univerrity 
^  This  research  was  sponsored  by  the  Aimy  Research  Office,  RTF,  NC.  Contract  No.  DAAL03-90-G-0126 


§  1 .  Introduction.  In  this  paper  we  consider  diffusion  problems  in 
heterogeneous  materials.  We  use  the  compact  volume  method  (CVM),  which  was  introduced  in 
M.  E.  Rose  [5],  to  discretize  the  governing  differential  equations.  The  advantage  of  CVM, 
relative  to  the  traditional  methods  such  as  finite  differences,  finite  elements  or  boundary  element 
methods,  is  that  CVM  is  higher  order,  in  both  the  solution  and  the  flux  errors,  accurate.  Here  we 
illustrate  quadratic  convergence  in  errors  in  ID  problems  related  to  heat  conduction  in  a  laminate, 
heat  conduction  with  phase  change,  and  fluid  flow  in  a  heterogeneous  porous  media. 

We  do  ncx  give  a  theoretical  analysis  of  the  convergence.  However,  the  resulting  algebraic 
systems  are  more  carefully  studied.  Some  iterative  methods  are  described,  and  criteria  for  their 
convergence  are  given.  Vectorization  and  multii^ocessing  issues  are  discussed. 

In  section  two  we  describe  the  CVM  for  linear  heat  conduction  problems  with  discontinuity 
in  the  thermal  conductivity.  Sections  three  and  four  are  applications  to  the  Stefan  problem.  In 
section  four  we  describe  the  CVM  with  the  heat  balance  equation  at  the  solid-liquid  interface  and 
develop  a  scheme  which  ccmveiges  quadratically  for  the  solid-liquid  interface,  the  solution  and  the 
flux  errors.  The  fifth  section  is  an  application  to  Richards'  equation  in  ID  where  the  hydraulic 
conductivity  is  piecewise  linear  in  the  solution  and  has  a  jump  discontinuity  in  the  space  variable. 


§  2 .  CVM  for  Linear  Problems.  In  this  section  we  review  the  general  form  of 
CVM  as  presented  in  M.  E.  Rose  [5].  For  problems  of  the  form  -Uxx  =  f  Rose  proved  quadratic 
convergence  in  both  u  and  Ux.  In  order  to  establish  the  general  form,  consider  the  following 
example; 


-Uxx  =  f  and  u(0)  =  0  =  u(l). 

I  ■  ^  ■  4  ■  I  ■  ^ 

Ui  U2  U3  U4  Us  U6  U7 

We  partition  the  [0, 1]  interval  into  4  cells.  The  even  nodes  are  inserted  to  insure  continuity 
of  the  flux  at  the  cell  boundaries.  Each  cell  has  length  equal  to  2  h. 
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i  =  odd; 


h  h 


hfi 


and 


i  =  even: 


h  ^  h 


The  resulting  algebraic  system  has  the  following  form  when  u  =  [Uodd>  Ueven]^  where  Uodd  is  the 
vector  of  odd  nodes  and  Ueven  is  the  vector  of  even  nodes. 


A 
■^21 
where 


C.,= 


1 

2h^ 


1 

1  1 
1  1 
1 


4  ***[/,  h  h  /X  ^*[00  of. 


In  this  case  the  coefficient  matrix  is  irredocibly  diagonally  dominant,  an  M-matrix  and  also 
symmetric  positive  definite.  Therefore  iterative  methods  such  as  SOR  and  ADI  can  be  used. 
Moreover,  when  ordering  the  nodes  as  even  or  odd,  the  matrix  has  the  form  as  in  the  red-black 
ordering  for  the  FDM.  Consequently,  vector  pipdines  can  be  effectively  used. 

The  coefficient  matrix  has  a  block  structure  which  makes  it  easy  to  use  block  Gauss 
elimination.  For  example,  in  the  above  we  may  eliminate  Uodd  to  get 


where 


[D2  -  C21  Dr*  C12]  Ueven  *  d2  +  C21  Dr*  dl 


D2  -  C21  Dr*  C12  =  ^ 


2  -1 
-1  2  -1 
-1  2 


(2) 


The  following  theorem  is  a  well  known  generalization  of  the  above. 
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Theorem  1.  Consider  the  algebraic  system  in  (1).  If  Di  and  D2  -  C21  Df’  C12  are  nonsingular, 
then  (1)  has  a  solution  and  it  is  given  by  (2)  and  Di  UoUd  =  di  +  Cn  Ueven- 

The  next  example  illustrates  CVM  where  the  media  is  heterogeneous,  that  is,  the  coefficient 
in  the  differential  equation  has  a  jump  discontinuity  with  respect  to  the  space  variable. 

Example  1.  Heat  Conduction  in  a  ID  Laminate. 

-(k(x)  Ux)x  =  10  x(l  -  x). 


u(0)  =  0  =  u(l)  and 


x<0.5 
X  ^  0.5  ■ 


The  solution  is  formed  by  requiring  u  and  k  Ux  to  be  continuous  at  x  =  0.5.  Both  the  CVM 
and  the  FDM  were  used.  The  FDM  has  n  ceUs  and  n  - 1  unknowns  and  the  CVM  has  n  /  2  cells 
and  n  •  1  unknowns.  Tables  1  and  2  illustrate,  for  this  example,  that  the  FDM  has  both  function 
and  derivative  errors  of  order  h  while  the  CVM  has  both  errors  of  order  h^. 


Table  1.  FDM  for  Heat  Conduction  in  Laminate 


n  \  error 

Fct.  Error 

Flux  Error 

8 

0.002456 

ai59  566 

16 

0.001227 

0.081413 

32 

0.000750 

0.041091 

64 

0.000495 

0.020629 

128 

0.000273 

0.010342 

Table  2.  CVM  for  Heat  Conduction  in  Laminate 


n\  error 

Fct.  Error 

Flux  Error 

8 

0.014448 

0,017367 

16 

0.003717 

0J00596% 

32 

0.000939 

0,001692 

64 

0.000235 

0.000445 

128 

0.000059 

0.00dll4 

The  CVM  generalizes  to  the  2D  domain  in  the  obvious  way  when  a  rectangular  grid  is 
used.  If  an  irregular  2D  domain  is  required,  then  one  may  use  a  variation  of  the  boundary  element 
method  (see  E.  K.  Bruch  [1])  and  finite  element  method  (see  R.  E.  White  [10]).  For  example,  if 
we  wish  to  use  triangular  elements  and  linear  shape  functions  on  -V-k  Vu  =  f,  then  we  can  insert 
an  additional  center  node  (node  o)  in  each  element  (These  are  analogous  to  the  odd  nodes  in  the  ID 
case),  see  Figure  1. 
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Figure  1.  CVM  for  2D  Element 

First,  we  apply  the  boundary  element  idea  to  each  element  and  solve  for  u  at  each  center 
node.  This  amounts  to  applying  the  Diveigence  Theorem  to  each  element  Let  q  » -k  Vu.  Then 
integrate  V’q=  f  over  e 

By  the  Divergence  Theorem 

Here  we  assume  u  is  known  at  the  nodes  1, 2  and  3.  The  u  can  be  expressed  as  linear  function 
above  each  subelement  ei,  e2  and  es.  Thus,  the  above  boundary  element  equation  has  just  u  at  the 
center  nodes  as  the  only  unknown. 

Second,  we  form  the  equations  for  each  unknown  at  the  vertex  nodes  of  every  element 
(These  are  analogous  to  the  even  nodes  in  the  ID  case).  This  is  done  by  requiring  q  to  be 
continuous  at  each  edge  of  every  element.  This  takes  the  form  of  3  simultaneous  equations  for 
every  element,  and  these  are  easily  solved.  One  can  also  combine  the  four  equations  and 
simultaneously  solve  for  the  four  unknowns  in  every  element.  In  either  case  the  resulting  system, 
via  assembly  elements,  can  be  solved  directly  or  iteratively. 
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§3.  Application  to  the  Stefan  Problem, 
generates  nonlinear  algebraic  problems  of  the  form 


The  FDM  when  applied  to  free  BVPs 


E  +  AtAP(E)  =  d 

where  A  is  a  matrix  associated  with  the  elliptic  part.  The  compact  volume  method,  CVM,  as 
described  in  [6, 7, 1 1]  generates  a  larger  system  of  the  form 


At^  Dj  -Ci2l[P(E)l_  dj 

0  ["^21  D2  j[  u  J  [d2j‘ 


where  D^,  D2  are  diagonal  matrices,  C12  and  C21  are  rectangular  matrices.  P(E)  is  as  in  the  FDM 
and  is  the  temperature  at  the  center  of  the  cells.  And  u  is  the  temperature  at  the  boundary  of  the 
cells.  The  diffusion  equation  fw  each  cell  (i  =  odd)  is 

Ei  1  fui+,-p(Ei)  p(Ei)-Ui_il  ^ 

S-2h  - h - h - 


The  flux  continuity  equaticm  at  cell  boundary  (i  =  even)  is 


1  [P(Ei+i)-Ui  Ui-P(Ei_i) 


Equation  (3)  should  be  viewed  as  a  first  version  of  the  CVM  method.  We  will  want  to  write  it  in 
the  alternate  form,  as  in  the  FDM, 


-+Di(E)  -C12 

-C2i(E)  D2 


P(Ei)l  3(E)  3(E) 

where  Di(E)  =  diag  dj-^— =Di  and  C2i(E)  =  C21 


In  two  and  three  dimensions  the  problem  is  more  conq>licated,  but  it  appears  that  ADI  schemes  will 
be  useful  in  approximating  the  solution  to  such  problems. 


Consider  the  simplified  problem  (6).  If  D2  is  nonsingular,  then  we  may  solve  for 


Then 


Or. 


u  =  D2‘[d2  +  C2i(E)E]. 

—  -  C12  D2’  C2i(E)  +  Di(E)|  E  =  di  +  C12  D2  d2  and 
1  +  (Di  -  C12  Di  C21)  E  =  d,  +  C12  D2'  d2 . 

^  +  (Di-C,2Di^C2i)P(E)  =  d. 


(7) 


Thus,  the  reduced  problem  in  (7)  has  the  form 

;|+Ap(E)  =  3 


where  A  is  a  symmetric  M-matrix.  Provided  A  and  d  have  been  computed,  SOR  and  ADI  methods 
as  described  in  section  6.4  and  12.1  in  [3J,  will  be  applicable. 

A  more  direct  approach  is  to  consider  the  initial  problem  in  (6).  Algorithms  1  and  2  are 
contractive  ^  in  the  discretization  given  for  the  FDM,  see  [1 1]. 


Algorithm  1.  Contractive. 


i+D,0  -C,j 

-1 

finH-l 

d, 

uiiM-1 

-C2i(E"*)  D2 

d2 

(8) 


Theorem  2.  If 


Dj  --C12 
-C21  D2 


is  an  M-matrix,  then  for  suitably  small  At,  Algorithm  1  converges 


to  the  unique  solution  of  (6). 


Algorithm  2. 


Induced  Contractive. 


Let 


i+D,(E)  -C,2 

-C2i(E)  D2 


=  B(E)  -  C(E)  be  a  splitting  of  the  matrix  in  (6). 


The  next  theorem  is  a  direct  application  of  the  results  in  [1 1]. 

Theorem  3.  If  p(B(E)'l  C(E))  :S  8  <  1  and  the  assumptions  of  Theorem  1  holds,  then  for  K 
suitably  large  Algorithm  2  converges  to  the  solution  of  (6). 

Example  2.  One  phase  Stefan  Problem  in  2D  and  Algorithm  2. 


In  this  example  we  use  Algorithm  2  to  compute  the  numerical  solution  of  a  one-phase 
Stefan  problem  in  two  space  dimensions  where  the  domain  D  =  (0,1)  x  (0,1)  x  (0,T),  T  =  I/V2 . 
The  exact  solution  of 


where 


is 


Et-P(E)xx-P(E)yy  =  0 


fE+1, 

P(E)  =  |o, 

Ie, 


E<-1 

-1^E:S0 

E>0 


E  =  ( 


r  et-(lA/7)(x  +  y)_ 


l-l. 


t- (1/^/2)  (x  +  y)^0 
t-(l/y2)(x  +  y)<0' 


1  0 


The  SOR  splitting.  Note  this  is  vectorizable  because  Dj  and  D2  are  diagonal  matrices.  The 
block  structure  in  (6)  reflects  the  red-black  ordering.  Vectorization  is  easily  implemented. 

The  ADI  splitting.  Here  the  nodes  in  (6)  must  be  reordered  to  reflect  the  classical  order.  For 
the  above  example  the  unknowns  will  be  ordered  as 


[El,  U2,  E3,  U4,  E5,  U6,  E7]T 


This  gives  collections  of  independent  tridiagonal  systems,  and  hence,  vector  versions  of  tridiagonal 
solvers  can  be  used. 

This  example  illustrates  the  SOR  and  ADI  splitting  of  Algorithm  2  on  the  above  Stefan 
problem.  We  compared  the  performance  of  the  SOR  version  of  Algorithm  2,  which  we  will  denote 
simply  as  CVM-SOR,  with  the  ADI  version  of  Algorithm  2,  denoted  as  CVM-ADI.  The 
computations  were  done  on  the  Cray  Y-MP.  The  parameters  set  for  Algorithm  2  specified  the 
number  of  nodes  in  each  direction,  the  acceleration  variables  and  the  tolerance.  For  our 
experiments,  the  number  of  nodes  was  the  same  for  each  space  and  time  direction.  Thus,  the  time 
step  size  is  approximately  forty  times  the  maximum  allowable  explicit  time  step  size  for  this  test 
problem. 

The  tolerance  values  were  set  at  eo,  =  10‘^t  Cout  =  for  the  inner  and  outer  iterations, 
respectively.  We  considered  four  refinements  of  the  spatial  discretization,  and  these  were  for  n  = 
16,  32, 64,  and  128  cells  in  each  direction.  Our  intent  was  not  to  determine  an  optimal  value  for 
the  acceleration  parameters  but  rather  to  determine  an  appropriate  value  at  which  the  inner  iterations 
are  minimal  for  our  test  problems.  For  the  CVM-SOR  method  numerical  experiments  showed  that 
the  appropriate  values  of  (o  were  1.5,  1.6, 1.7,  and  1.8  for  n  =  16,  32,  64  and  128,  respectively. 
For  the  CVM-ADI  method  numerical  experiments  showed  appropriate  values  of  acceleration 
parameter  a  were  4.0,  6.0,  8.0  and  10.0  for  n  =  16, 32, 64,  and  128,  respectively.  Experiments 
showed  that  a  can  be  chosen  without  much  consideration,  which  is  an  advantage  of  the  CVM-ADI 
version  of  Algorithm  2.  We  also  considered  a  version  of  CVM-ADI  that  uses  a  more  sophisticated 
acceleration  scheme  with  variable  otj. 


Table  3.  CPU  time  (secs)  on  Cray  Y-MP,  for  CVM-SOR 


wmm 

n  =  16 

<N 

II 

c 

n  =  64 

n  =  128 

serial 

0.06 

0.76 

10.27 

125.58 

vector 

0.02 

0.12 

1.14 

12.11 

In  our  experiments  we  used  two  versions,  serial  and  vector,  of  the  codes  that  ran  on  the 
Cray.  The  purpose  here  was  to  determine  the  degree  of  vectorizability  in  each  method.  Tables  3 
and  4  show  the  CPU  time  both  algorithms  took  to  solve  each  of  the  successive  refinements  of  our 
basic  uesh.  In  the  case  of  CVM-SOR  method  the  block  structure  of  the  problem  reflects  the 
tradiaonal  red-black  ordering  of  the  nodes  and  thus  allows  for  effective  vectorization.  In  the  case 
of  CVM-ADI  method  we  implemented  the  vector  version  of  the  basic  tridiagonal  solver. 

Table  4.  CPU  time  (secs)  on  Oay  Y-MP,  for  CVM-ADI 


wmsm 

n  =  16 

n  =  32 

n  =  64 

n=  128 

serial 

0.31 

2.S8 

27,12 

254.21 

vector 

0,13 

0.77 

6.28 

50.55 

Qearly  the  CVM-SOR  method  exhibits  higher  speedups  as  compared  to  those  of  the  CVM- 
ADI  method.  Both  methods  exhibit  good  accuracy.  As  noted  above  we  also  considered  a  version 
of  CVM-ADI  that  uses  variable  acceleration  parameters,  Oj.  Table  5  shows  the  CPU  time  of  the 
vectorized  CVM-SOR,  CVM-ADI  and  the  CVM-ADI  with  variable  ttj’s,  denoted  as  ADIv.  Here 
we  see  that  the  variable  a  version  of  (TVM-ADI  is  competitive  with  CVM-SOR,  and  the  CVM- 
ADIv  seems  to  be  more  robust  than  the  CVM-SOR.  These  results  are  consistent  with  those  in  [1 1] 
where  the  FDM  is  considered. 

Table  5.  CPU  times  of  the  Versions  of  Algorithm  2 


n=  16 

9 

II 

n  =  64 

n  =  128 

SOR 

0.02 

1.14 

12.11 

ADI 

0,13 

0.77 

6.28 

50.55 

ADIv 

0.05 

0.27 

1.43 

iO.I2 

The  single  sweep  ADI  method  also  works  quite  well  for  two  dimensional  CVM 
discretizations  of  the  form  (6).  Let 


D,  -Ci2 

-C21  D2 


=  H  +  V, 


I 

2At 

0 


0 


and 


0 

0 


+  V. 


H(E)  and  V(E)  are  consistent  with  the  notation  in  (6).  Then  we  can  write 


^Di(E)  -C12 
-C2i(E)  D2 


*a+H(E))-a-v(E))=a+v(E))-a-H(E)). 


From  here  one  can  use  the  two  step  ADI  sweeps  (See  Rose  [6]),  or  implicit  single  ADI  sweeps  as 
follows. 

Algorithm  3.  Single  ADI  sweep. 

Consider  the  time  dependent  CVM  method  where  a  sequence  of  algebraic  problems  of  the 
form  (6)  are  considered.  Let  n  =  time  step.  The  ^gle  sweep  ADI  method  is  for  E  =  [E,  u]*^ 

(a  I  +  H(E"+i/2))  En+1/2  =  (a  I  -  VfE"))  +  d  and  (10. 1 ) 

(a  I  +  V(En+i))  =  (a  I  -  H(E"+i/2))  e«»+i/2  +  d.  (10.2) 


The  nonlinear  problems  (10.1)  and  (10.2)  may  be  approximated  via  the  contractive 


mapping;  for  example,  in  ( 10. 1 )  with  d  the  right  side  of  ( 10. 1 ) 


§'‘^1  =  (a  I  +  d. 


Here  a  must  be  suitably  large  and  can  be  used  adaptively  to  obtain  optimal  convergence. 

Example  3.  One  phase  Stefan  problem  and  Algorithm  3. 

This  example  illustrates  Algorithm  3  and  the  use  of  vectOT/multiprocessing  computers. 
Consider  the  above  2D  one  phase  Stefan  problem.  The  computations  in  Table  6  were  done  on  an 
Alliant  FX/40  and  Cray  Y-MP.  The  Alliant  FX/40  has  two  processors  each  with  vector  pipelines 
and  170  nsec  clock  cycle  time,  and  the  Cray  Y-MP  was  used  with  one  CPU,  vector  pipeline  and  4 
nsec  clock. 

The  CVM  space  discretization  allows  elective  use  of  independent  tridiagonal  solvers  in 
either  the  form  of  vectorized  tridiagonal  or  multiprocessing  tridiagonal  algorithms.  In  the  case  of 
the  Alliant  FX/40  both  versions  were  used  for  the  (2  epu,  vector)  calculations.  Table  6  records  the 
computing  times  (sec)  for  n  cells  in  each  directitm  and  live  different  computer  cemfigurations. 


Table  6.  ADI  and  CPU  Times 


computerVn 

n=  16 

n  =  32 

n  =  64 

Alliant  (serial) 

nt 

15  ^ 

'1±L28  " 

Alliant  (vectOT) 

l.vZ 

5.98 

38.63 

Alliant  (2  epu,  vector) 

3.32 

21.07 

Cray  (serial) 

0.129  . 

A 

0.89 

'6.52^  ^ 

Cray  (vector) 

032 

1.72 

§4.  Application  to  the  Stefan  Problem:  Higher  Order  Approximation. 

A  more  complicated  problem,  in  one  dimension,  is  the  combination  of  (6),  (1 1),  (12)  and  (13). 
Here  we  will  use  an  nonlinear  SOR  scheme  and  use  the  generalization  of  M-matrices  to  M- 
functions  as  described  in  section  13.5  of  (3).  More  importantly,  we  will  want  to  consider  the 
location  of  the  solid-liquid  interface;  here  the  flux  is  not  continuous  as  in  (5). 


1  4 

Suppose  that  at  some  iteration  we  observe  PCEj)  >  Uf  and  ^(£2)  <  Uf.  Between  Xj  and  x, 
there  must  exist  an  s  such  that  P(E  at  s)  =  Uf ,  see  Figure  2. 


Figure  2.  Phase  Change 


Thus  (5)  must  be  modified  in  three  ways. 

At  xj.we  are  in  a  liquid  cell  [xq,  s]  and 

Ei^ _ Uf-P(Ei)  P(E|)-Uo 

Ats-xo  s-Xi  h 


(11) 


At  s,  we  have  the  classic  heat  balance  equatioi  and 


-L,  -X  f“f-P(Ei)  P(E2)-Uf' 
At  *  s-xj  X2-S 


At  X2,  we  are  in  a  solid  cell  [s,  X3]  and 


E2  1 

U3-  ^(£2) 

P(E2)-Uf 

At  X3  -  s 

h 

X2-S 

(12) 


(13) 


Definition:  The  CVM  for  the  one  dimensional  free  BVP  has  the  form  of  (6)  supplemented  with 
equations  (11),  (12)  and  (13). 
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being  a  M-matrix  or  a  SPD  matrix.  For  the  M-matrix  case  the  system  consisting  of  (6)  coupled 
with  (11),  (12)  and  (13)  is  most  likely  an  M-function,  see  [3].  Consequently,  nonlinear  SOR 
methods  are  applicable.  Indeed,  careful  inspection  of  Figure  2  gives  the  off-diagonal  antitone  and 
the  diagonally  isotone  conditions  for  the  component  of  the  system  for  the  unknown  interface.  The 
other  nodes  have  equations  which  are  similar  to  the  reduced  enthalpy  formulation  of  the  Stefan 
problem,  and  the  reduced  system  is  known  to  be  an  M-funcdons. 

Algorithm  4.  Enhanced  SOR  for  two  phase  Stefan  problem. 

Consider  (6),  (1 1),  (12)  and  (13)  where  the  nodes  i  =  1  and  2  represent  a  possible  change  in  phase 
at  some  point  in  the  iteration,  require  |s  -  J|  ^  /i  so  that  we  must  chose  the  center  root  of  (12): 

do  an  nonlinear  SOR  sweep  of  (6) 
solve  (12)  for  the  center  root 
solve  (11)  and  (13) 
repeat  until  ccmvergence. 

The  advantage  of  the  more  general  CVM  problem  (6),  (1 1),  (12)  and  (13)  is  that  the  order 
of  convergence  is  higher  than  (6),  or  the  FDM.  This  has  been  observed  for  a  number  of 
experiments  for  one  dimensional  problems.  The  extension  to  two  or  three  space  dimensions  will 
be  a  little  more  challenging.  It  appears  that  the  solid-liquid  interface  condition  in  higher  dimensions 
can  be  decoupled  into  equations  similar  to  (11),  (12),  and  (13).  This  suggests  that  a  nonlinear 
ADI  scheme  can  be  developed.  In  each  directicm  a  SOR  algorithm  similar  to  Algorithm  4  can  be 
used.  Thus,  we  hope  for  two  and  three  space  dimensions,  to  be  able  to  obtain  higher  order 
convergence  of  the  discrete  problem  to  the  continuum  problem,  and  to  be  able  to  more  accurately 
locate  the  solid-liquid  interface.  The  above  schemes  have  a  large  portion  of  independent  parts,  and 
hence,  high  performance  computing  can  be  used. 

Example  4.  Two  phase  Stefan  problem  and  solid-liquid  interface. 


E.  -  P(E)„  =  0 


P(E)  = 


(E, 

•  1. 

l(E-2)-i-l, 


where 


E<1 

1^E^2. 

E>2 


The  boundary  and  initial  conditions  were  chosen  so  that  the  following  was  the  solution 


E  = 


X  <t 

x^t' 


The  solid-liquid  interface  is  given  by  s(t)  =  t.  In  the  calculation  we  started  at  t  =  0.2  and 
ended  at  t  s  0.37.  Over  this  time  interval  the  average  function  and  flux  errors  were  tabulated  as 
well  as  the  approximate  value  of  the  solid-liquid  interface.  Tables  7  illustrates  quadratic  (Ax^) 
convergence  for  At  =  Ax^  and  this  example,  in  the  solid-liquid  interface  enor  and  the  function 
error.  The  ctmvergence  of  the  derivative  error  was  at  least  linear  (Ax)  in  all  cases. 

Table  7.  CVM  for  Stefan  Problem  with  At »  Ax^ 


nXcrror 


s  Error 


Fcl  Error 


Der.  Error 


10 


0X103900 


wmmmjmmfrim.. 

«■  ^4'“  ••  ® 


0. 


0.030100 


20 


0.001000 


40 


a000250  ! 


i 


80 


o: 


160 


Even  if  the  criteria  on  At  is  relaxed,  the  errors  remain  relatively  small  as  is  illustrated  by 
Tables. 


Table  8.  CVM  for  Stefan  Problem  with  At  0. 1  Ax 


nXerror 

s  Error 

Fct  Error 

Der.  Error 

10 

0.003900 

0.007000  - 

20 

0.001300 

40 

0.000400 

D.00D682 

omsm 

80 

0,000170' 

a<)«B47  .  ,  . 

160 

0.0(10075 

00(X)0^ 

§5.  Application  to  Richards'  equation.  Richard's  equation  describes  fluid 
flow  in  a  porous  media  and  was  initially  formulated  in  L.  A.  Richards  in  1931  [4],  Presently,  it  is 
being  extensively  used  in  ground  water  modeling  and  in  contamination  modeling,  see  R.  A.  Freeze 
and  J.  A.  Cherry  [2].  This  equation  has  strong  nonlinear  terms  in  the  empirical  functions  for 
moisture  and  hydraulic  conductivity,  see  [11].  Moreover,  the  porous  media  are  often 
heterogeneous;  that  is,  they  are  dependent  on  space  position  and  often  have  discontinuities  in  the 
space  variable.  A  simple  ID  example  of  the  steady  state  Richards'  equation  is  as  follows  where  u 
is  the  pressure  and  k(x,u)  is  the  hydraulic  conductivity. 


Example  5. 


where 


Richards'  Equation  in  ID. 
-(k(x,u)  Ux  +  k(x,u))x  =  f(x) 


u(0)  =  0,  u(l)  =  1  +  ai  +  ao  and 


k(x,u) 


(3  u  +  0.5, 
\u+l. 


x<0.5 
x  S  0.5  * 


The  right  side  was  chosen  so  that 


u(x)  = 


x^  +  ai  X  +  ao. 


x<0.5 

x^O.5 


was  a  solution.  The  constants  ai  and  ao  were  chosen  so  that  u  and  k  Ux  +  k  were  continuous  at  x  = 
0.5.  The  CVM  has  the  form 

i »  odd:  -  ^.-i  ]  =  2hf, 

where  kj*,  =k(x,i, e>0  and 

i »  even:  k*  +  k*  =  k ”  +  k~ 

h  h 

where  k*  =k{x±e,u^,  £>0. 


The  e  >  0  are  used  to  insure  the  hydraulic  conductivity  is  taken  from  the  appropriate  cell. 


Table  9. 


CVM  for  Richards'  Equation 


n\  error 

Fct.  Error 

k  Ux  +  k  Error 

8 

0.05583 

0.09962 

16 

0.01861 

0.05490 

32 

0.00525 

0.01950 

64 

0.00139 

0,00572 

128 

0.00036 

0.00153 

In  the  calculations  recorded  in  Table  9,  n  /  2  are  the  number  of  cells,  and  there  are  n  •  1 
unknowns.  Nonlinear  SOR  was  used  with  the  hydraulic  conductivity  being  updated  as  soon  as 
any  variables  were  computed.  The  function  and  flux  =  k  u^  k  errm^  were  the  max  norm  at  the 
center  of  the  cells  and  cell  boundaries,  respectively.  Inspection,  of  the  table  entries  indicate  the 
CVM  has  quadratic  convergence  in  both  the  function  and  flux  errors. 

The  above  example  illustrates  a  fluid  flow  in  a  porous  medium  which  has  two  layers  where 
the  pressure  does  not  vary  over  a  large  range.  Hence  the  hydraulic  conductivity  is  a  piecewise 
linear  function  of  position  and  pressure.  Extension  to  the  time  dependent  2D  problem  where  the 
empirical  functions  depend  only  on  the  pressure  are  given  in  [1 1]  and  its  references.  These  results 
combine  to  suggest  that  the  general  problem  can  be  solved  in  the  present  context 


§6.  Conclusions.  We  have  shown  for  a  number  of  heterogeneous  diffusion 
problems  that  the  CVM  discretization  method  gives  higher  order  error  estimates  for  both  the 
function  and  flux  errors.  Moreover,  we  have  illustrated  that  both  the  ADI  and  SOR  algorithms  can 
be  adapted  to  solve  the  resulting  nonlinear  algebraic  systems.  Vectorization  and  multiprocessing 
computers  can  be  use  to  effectively  execute  these  algorithms. 

In  the  applications  to  the  heat  conduction  in  a  laminate  material,  the  Stefan  problem,  and 
Richards'  equation  we  have  indicated  how  one  can  extend  the  CVM  to  2D  and  3D  space  problems. 
All  these  problems  are  of  current  interest  to  physical  scientists  and  engineers.  Moreover,  the 
mathematical  analysis  of  convergence  for  the  more  complicated  versions  of  CVM  is  certainly 
needed. 
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ABSTRACT 

This  paper  describes  an  SOR  algorithm  for  solving  the  nonlinear  algd>raic  system  which 
evolves  from  Richards'  equation  that  models  fluid  flow  in  a  porous  media.  The  moisture 
content  and  hydraulic  conductivity  functions  are  approximated  by  piecewise  linear 
hmctions  obtained  from  field  data.  The  resulting  algd>raic  system  is  solved  by  a  variation 
of  the  nonlinear  SOR  algorithm.  The  advantage  of  this  approach  is  that  it  avoids  some  of 
the  numerical  oscillations  associated  with  large  derivatives  in  the  data.  Numerical 
calculations  are  presented  and  illustrate  the  following:  (i)  agreemem  of  the  numerical 
model  with  observed  data,  (ii)  dependence  and  comparison  results  as  a  function  of 
uncertain  data,  and  (iii)  suitability  of  these  algorithms  for  multiprocessing  computations 
via  domain  decomposition  methods.  Extension  of  these  algorithms  to  heterogeneous 
porous  media  fluid  flow  are  discussed. 


'The  calculations  were  done  at  the  North  Carolina  Supercompiiting  Center. 
^Supported  by  CNPq  and  Promon  Engenharia  Crom  Brazil. 
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1.  INTRODUCTION 

The  study  of  fluid  flow  in  porous  media  has  several  important  applications  in 
engineering  (Kaviany  1991  and  Nield  and  Bejan  1992).  Specific  examples  are  filters  for 
industrial  use,  or  separators  in  aerospace  fuels  (Kaviany  1991),  use  of  geothermal  energy 
(Rae  et  al.  1983  and  Kimura  1989,  1989a);  oil  recovery  (Bear  1972);  groundwater 
(Mark  1992, 1993  and  Clothier  et  al.  1981)  and  agriculture  (Feng  1993). 

Industrial  chemical  or  radioactive  effluents  are  sometimes  deposited  at  the  surface 
or  in  drums  that  are  buried  underground.  In  both  normal  operations  and  in  accidental 
conditions,  h  is  required  to  give  an  analysis  of  the  transport  of  the  contaimnants  throu^ 
the  soil  (Muralidhar  1990,  1993).  The  first  step  in  this  analysis  is  the  mathematical 
^ulation  of  fluid  flow  through  the  soil. 

Richards  (1931)  developed  an  equation  that  is  a  combination  of  the  continuity 
equation  and  Darcy's  law  (Philip  1969),  and  it  models  fluid  flow  in  a  porous  medium.  It  is 
a  nonlinear  parabolic  partial  differential  equation  which  contains  the  empirical  fimctions 
for  moisture  content  0(h)  and  hydraulic  coruiuctmiy  K(h). 

e(,K),  -  V  K{h)Vh  -  K{h\  =0  in  Qx(0,T)  (la) 

where  h  is  the  hydraulic  pressure  head,  z  is  the  vertical  direction  and  Q  is  the  space 
domain.  The  boundary  condition  of  the  third  Idnd  has  the  form 

\  K  {h)V  {h  +  z)j*n  =  giwn  in  Sx(0,T)  (lb) 

where  n  is  the  unit  outward  normal  to  Q  and  S  is  the  boundary  of  Q.  The  initial  condition 
is 

h  =  given  fort  =  0  in  Q.  (Ic) 
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in  general  the  equations  ( la-lc)  are  coupled  with  a  parabolic  system  of  equations  for  the 
transport  of  a  number  of  contaminants  through  the  soil  (Freeze  and  Cherr>  1979  and  Feng 
1993)  This  is  done  by  using  the  fluid  velocity  v  =  K(h)V(h  +  z)  which  is  computed  from 
the  above  system 

In  practice  the  empirical  functions  for  moisture  content  and  hydraulic  conductivity 
have  several  troublesome  properties.  First,  they  can  have  large  derivatives,  and  this  is 
often  the  case  for  hydraulic  conductivity.  Second,  they  are  not  precisely  known.  Third, 
they  can  have  strong  space  dependence  with  jump  discontinuities  resulting  from 
heterogeneous  porous  media.  The  objective  of  this  paper  is  to  give  an  approach  to  these 
problems  which  is  based  on  methods  used  for  the  Ste&n  problem  (Silva  Neto  and  White 
1993).  Particular  attention  will  be  given  to  the  first  problem  where  there  is  no  space 
dependence.  In  the  case  of  space  dependent  empirical  Auctions,  one  can  use  additional 
nodes  and  the  continuity  condition  on  the  fluid  velocity  (White  et  al.  1993)  to  generalize 
the  methods  of  this  paper. 

Traditional  methods  for  the  solution  of  (la-lc)  use  an  ^proximation  of  the 
empirical  data  by  esqunential  functions  (van  Gouchten  and  Nielsen  1985).  Then 
numerical  methods  such  as  Newton,  Picard,  or  Lees  implicit  foctored  method  can  be  used 
for  problems  without  large  derivatives  (Paniconi  et  al.  1991).  In  addition  to  addressing 
the  above  problems,  the  approach  of  this  paper  does  not  involve  expensive  function 
evaluations  and  does  eliminate  the  numerical  oscillations  associated  with  large  derivatives 
of  the  empirical  functions. 

In  section  two  we  present  the  general  approach  to  the  problem  which  is  adapted 
from  the  Stefan  problem.  Here  the  empirical  functions  are  approximated  by  piecewise 
linear  functions  which  reflect  the  field  data  (White  and  Broadbridge  1988).  The  partial 
differential  equation  is  discretized  by  the  finite  difference  method,  and  the  resulting 
nonlinear  system  is  solved  by  a  nonlinear  SOR  algorithm  (Cryer  1971)  which  is  described 
in  section  three. 


4 


Numerical  experiments  are  presented  in  sections  four,  five  and  six,  and  these 
experiments  were  chosen  to  demonstrate  the  feasibility  of  realistic  two  dimensional 
simulation  of  porous  flows.  Later,  we  indicate  how  one  can  extend  these  methods  to  three 
dimensional  and  heterogeneous  porous  flows.  We  show  agreement  of  the  numerical 
model  with  the  field  data  fi’om  Brindabella  silty  loam  soil.  Also,  we  show  how  one  can 
develop  comparison  results  vriiich  deal  with  the  uncertain  empirical  data.  High 
performance  computing  issues  are  described.  Here  we  demonstrate  that  tin  algorithm  in 
section  three  does  not  vectorize  well,  but  it  does  work  well  for  multiprocessors  when 
domain  decomposition  methods  are  used.  Finally,  we  state  our  conclusions  and  related 
woric. 


2,  DISCRETE  VERSION  OF  RICHARDS'  EQUATION 

In  this  section  we  state  the  finite  dtfierence  discretization  of  Richards'  equation  and 
make  some  comparisons  with  the  Stefian  problem.  If  in  equation  (la)  the  last  term  is 
eliminated,  and  h  were  to  represent  temperature  with  K  now  denoting  the  thennal 
conductivity  and  6  the  enthalpy,  then  this  would  be  the  enthalpy  formulation  of  the  Ste&n 
problem  (White  1985).  In  the  Ste&n  problem  the  K  and  6  have  jump  discontinuities  at  the 
phase  change  temperature.  SOR  methods  can  be  effectively  used  provided  the 
overrelaxation  is  not  2q)plied  during  a  cell's  phase  change. 

In  Richards'  equation  we  will  approximate  K  and  6  by  piecewise  linear  fimctions 
which  could  be  viewed  as  a  number  of  "linear  phases"  associated  with  the  nonlinear  flow. 
As  in  the  Stefan  problem  we  will  apply  the  SOR  method  provided  the  cell  is  not  changing 
"linear  phase."  Table  1  gives  some  data  for  Brindabella  loam  which  was  extrapolated 
fi-om  the  graphs  in  White  and  Broadbridge  (1988).  Note,  both  9(h)  and  K(h)  are 
monotone,  and  K(h)  has  large  derivatives. 


Table  1: 


Brindabella  Data 


m 

III 

■33m 

-8.0 

0.11 

0.0 

-1.2 

0.27 

0  000  118 

-0.8 

0.30 

0.000  327 

-0.7 

0.31 

0.000  457 

-0.6 

0.32 

0.000  664 

-0.5 

0.33 

0.001  138 

-0.4 

0.34 

0.001  693 

-0.3 

0.36 

0.002  326 

-0.2 

0.38 

0.004  568 

-0.1 

0.42 

0.011  117 

-0.0 

0.485 

0.118  000 

In  Silva  Neto  and  White  (1993)  two  "linear  phases"  were  used  and  were  coupled 
with  the  Kirchhoff  change  of  dependent  variable.  Ahhou^  this  was  a  cnide 
approximation  of  the  data,  it  did  track  the  wet>dry  inteiftoe  where  a  rapid  transition  from 
unsaturated  to  saturated  regions  occurs.  In  cases  i^diere  either  the  tran^on  is  not  rapid 
or  there  is  space  dependence  of  the  data,  the  Kirchhoff  transfr>rmation  is  not  applicable 
In  the  following  w  ?  jse  an  implidt  time  discretization  of  Richards'  equatimi. 

*  -  (*(*)*. ).  -  ),  -  K(*),  =  0  (2) 

where  h  is  known  from  the  previous  time  step.  Here  we  are  in  two  space  dimenaons,  and 
y  is  the  vertical  direction. 

Next  we  discretize  the  space  variable  by  the  finite  difference  method.  In  the 
calculations  that  we  later  discuss,  we  conader  a  two  dimension  flow  with  zero  flow 
through  the  sides  and  bottom,  and  nonzero  flow  through  the  top.  The  finite  difference 
grid  is  illustrated  in  Figure  1  where  the  nine  types  of  boundary  cells  are  indicated.  Here 
there  are  N  =  3  cells  in  each  direction  and  =  9  unknowns. 


(K(h)f^,-K(h))=R 


KK(h)hy+  K(h))=0 

Figure  1:  Finite  Difference  Grid 


Let  (ij)  denote  the  location  in  the  finite  difiference  grid.  Then  the  general  form  of 
the  finite  difference  equation  at  this  location  is 


If  (iJ)  is  an  interior  node,  then 


dr 


Ay 


0{k,)  1 


In  the  above  equation  we  used  the  convention  that  A',., , ,  ,,,  is  the  average  of  the 

hydraulic  conductivity  at  the  appropriate  surrounding  nodes  We  also  will  assume  that  the 
surrounding  nodes  are  evaluated  at  a  "previous  iteration"  value  Thus  equation  (3)  is  a 
piecewise  linear  system  as  illustrated  in  Figure  2  Both  the  piecewise  linear  approximation 
of  the  moisture  content  and  hydraulic  conductivity  functions  are  monotone,  and  so,  V 
must  also  be  monotone  nondecreasing.  Since  the  term  on  the  left  side  of  equation  (3)  has 
negative  dope,  equation  (3)  has  a  solution,  and  h  is  unique.  In  Figure  2  the  data  is 
deinaed  as  being  continuous,  but  this  need  not  be  the  case.  Even  if  Ffh)  has  a  jump  and 
remains  nondecreasing,  one  can  stiU  solve  for  a  unique  h  (Silva  Neto  and  White  1993). 


3.  NONLINEAR  SOR  ALGORITHM 

The  following  algorithm  has  evolved  from  the  work  by  Cryer  (1971)  for  set  valued 
systems  of  equations  that  may  come  from  models  of  the  Stefan  problem.  We  apply  a 
variation  to  the  system  given  in  (3).  The  following  variables  are  used; 

maxit  =  number  of  allowable  SOR  iterations  per  time  step, 

/?, ,  /!,=  number  of  cells  in  the  x  and  y  direction, 

=  overrelaxation  (larger  than  1 .0)  and  underrelaxation  (less  than  1 .0) . 


K 


Nonlinear  SOR  Algorithm 
for  k  =  Kmaxit 

for  i  =  l,/i^ 

forj=  l,n, 

compute  .  and  j  as  given  in  (3) 

solve  for  h  in  (3)  as  given  in  Figure  2 

if  h  and  h,  ^  are  in  the  same  linear  phase,  then 

dse 

h,j  =  (1  -  ©)h,j  +  ®  h 

endif 
end  loop  j 
end  loop  i 

test  for  convergence 
end  loop  k. 

In  the  computation  of  the  c  and  d  values  one  must  conader  the  nine  types  of  cells 
as  indicated  in  Figure  1.  For  more  complicated  geometric  configurations  and  for 
heterogeneous  porous  media,  this  will  be  more  complicated.  If  adjacent  cdls  have 
different  moisture  coment  and  hydraulic  conductivity  data,  then  one  must  insert  additional 
nodes  between  the  cells  and  demand  continuity  of  the  flow  velocity  at  the  inter&ce  (White 
etal.  1993). 

In  the  solve  step  one  must  determine  vdiich  linear  phase  the  solution  is  in,  aiKl  this 
is  done  by  partitioning  the  vertical  axis  as  indicated  in  Figure  2  by  die  dotted  lines  that  are 
parallel  to  the  line  given  by  d-ch.  Hence,  the  solve  step  has  a  loop  in  it  which  was  not 
indicated  above.  This  hidden  loop  contains  the  nonlinear  nature  of  the  solve  step. 


Moreo\  er.  if  there  is  a  large  number  of  linear  phases,  then  the  solve  step  will  become  more 
expensive  to  compute 

The  overrelaxation  is  used  to  reduce  the  number  of  outer  iterations,  and  the 
optimal  choice  Avili  vary  with  the  number  of  unknowns  The  underrelaxation  is  used  to 
avoid  numerical  oscillations,  and  this  works  well  for  choices  between  0.8  and  0  9  The 
numerical  oscillations  are  a  result  of  passing  from  one  linear  phase  to  the  next  linear  phase. 
This  deids  with  tlu:  large  derivatives  of  the  data  by  breaking  the  changes  in  slopes  into  a 
number  of  smaller  changes  in  slope.  We  found  this  to  be  much  mote  elective  than  the 
traditional  method  of  reducing  the  time  step. 

We  experimented  with  a  number  of  convergence  tests.  Finally,  we  imposed  two 
conditions; 

(i)  max|  new  h  -  old  h  |  ^  and 

(ii)  JJ  I  new  0  -  old  0\  ^ 

The  first  condition  is  aimed  at  possible  convergence  of  the  pressure  at  each  node.  The 
second  condition  reflects  posable  convergence  of  the  total  moisture,  and  it  is  more  of  a 
global  test  than  the  first  condition. 

4.  COMPUTATIONS  FOR  BRINDABELLA  LOAM 

The  purpose  of  these  computations  is  to  see  if  our  model  of  Richards'  equation  will 
accurately  track  the  movemoit  of  moisture  tiirough  Brindabella  loam.  We  compare  our 
calculations  with  the  observations  in  White  and  Broadbridge  (1988).  In  our  numerical 
model  we  considered  a  0.3[m]  x  0.3[m]  region  with  boundary  conditions  as  indicated  in 
Figure  1 .  In  the  top  boundary  we  used  R  =  0.016S[m/hr],  and  the  initial  pressure  was  set 
as  h  -  -8.0[m].  The  moisture  contem  fimction  was  a  linear  interpolation  of  the  data  in 
Table  1.  The  hydraulic  conductivity  data  indicates  a  very  increasing  and  concave  up 
function;  consequently,  linear  imerpolation  of  the  data  would  generate  large  errors.  In  the 


1(1 


calculations  presented  in  Figure  3a  we  used  a  linear  interpolation  of  the  modified  data  for 
hydraulic  conductivity  in  Table  1,  we  reduced  the  interior  values  by  50  percent  and  kept 
the  two  end  values  at  0  0  and  0  118 

In  our  computations  we  set  the  following  parameters  at 
At  =  0  125[hr]  Ax  =  0.015[m] 

N  =  20[cells  in  each  dir  ]  R  =  0.0165{m/hr] 

lii  =0.9  tn  =1.4 

ej  =  10-^  62  =  lO"*. 

Convergence  was  usually  attained  in  20<40  iterations.  If  no  underrelaxation  was  used, 
then  numerical  oscillations  would  occur  dx)ut  once  every  20  time  steps. 

During  the  initial  times,  the  l^drauiic  conductivity  is  small,  and  Richards'  equation 
is  dominated  by  wave  like  properties.  As  time  progresses  the  hydraulic  conductivity 
increases  so  that  Richards'  equation  is  dominated  by  difihision.  At  time  6.0[hr]  the  steady 
state  solution  has  essentially  been  reached.  At  the  bottom  (y  =  300[mm])  the  porous 
media  is  at  saturation  (0  =0.485).  At  the  top  (y=0[mm])  the  porous  media  has  pressure 
such  that  K(h)  =  R  (8(h)  =0.426).  At  this  time  the  diffiision  force  is  equal  to  the 
gravitational  force;  hence,  no  more  moisture  can  enter  the  porous  media. 


II 
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Figure  3a:  Computed  Moisture  for  Variable  Times 
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The  observed  moistures  are  indicated  by  discrete  poims  in  Figure  3b.  These  were 
for  the  times  of  1.0  (diamonds),  2.25  (triai^es)  and  5.0  hours  (squares).  The  computed 
values  give  good  agreement  with  the  observed  values.  The  computed  moisture  comem 
curves  are  somewhat  more  smoothed  than  the  observed  moisture  comem  data.  This  may 
be  attributed  to  large  hydraulic  conductivity  data;  if  one  reduces  the  hydraulic 
conductivity  for  smaller  pressures,  then  a  sharp  from  can  be  calculated  to  match  the 
observed  moisture  comem  data. 
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Figure  3b:  Observed  Moisture  Data  for  Variable  Times 


5.  COMPUTATIONS  WITH  UNCERTAIN  DATA 

This  section  contains  an  analysis  of  the  moisture  as  a  fiinction  of  the  empirical 
data  of  the  moisture  content  and  iQrdraulic  conductivity  functions.  In  practice  much  of  this 
data  is  not  precisely  known,  and  therefore,  the  effects  upon  the  computations  from  any 
model  will  have  some  uncertain  aspects.  In  oai  numerical  mqperiments  we  decreased  K 
and  the  computed  moisture  at  the  top  increased.  We  also  increased  6  and  the  computed 
moisture  at  the  top  increased.  In  the  computations  indicated  in  Figure  4,  for  time  equal  to 
1.2S[hr],  we  decreased  K  and  increased  0,  and  the  hugest  computed  moisture  .^ment  at 
the  top  was  the  result.  Here  we  kept  the  data  at  the  end  points  of  Table  1  fixed  and  varied 
the  interior  data  by  increments  of  20  percent.  In  all  computations  the  computed  moisture 
content  at  the  top  increased  while  the  computed  moisture  content  at  the  bottom 
decreased.  This  happens  because  the  sides  and  bottom  do  not  permit  flow  through  them, 
and  the  total  moisture  must  remain  constant. 
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Figure  4:  Moistuie  end  Vuriuble  DiU 

In  order  to  gain  some  insight  on  this,  it  is  instnictive  to  examine  the  finite 
difference  equation  at  the  top  r^on.  In  the  case  of  the  top  center  nodes,  d  has  tlK  form 

d  =  —  -  +  d  where  d  has  form  amilar  to  that  given  in  (3). 

Ay  Ay 

Figure  5  shows  that  if  F  increases,  then  the  solution  of  (3)  will  decrease.  Also,  if  d 
decreases,  then  the  solution  of  (3)  will  decrease.  Therefore,  if  both  F  increases  and  d 
decreases,  then  the  solution  of  (3)  will  decrease.  If  K  decreases,  then  d  will  increase  and 
F  will  decrease.  However,  if  both  K  decreases,  and  0  increase*:  enough,  then  F  will 
increase.  For  our  choices  of  At  and  Ay  this  is  the  case.  Of  course,  this  is  just  an  analysts 
at  one  grid  point,  and  the  argument  requires  much  more  careful  discussion. 
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6,  COMPUTATIONS  USING  MULTIPROCESSORS 

In  the  computations  reported  in  this  section  we  tried  to  implement  the  above 
algorithm  on  a  single  CPU  with  vectorization  on  a  Cray  Y>MP,  the  AUiant  FX-40  with 
two  vectorized  CPUs,  and  the  Kendall  Square  Research  KSRl  with  up  to  16  CPUs  and  ik> 
vectorization.  In  the  calculations  in  Silva  Neto  and  White  (1993)  the  vectorization 
methods  did  not  seem  to  work  well.  These  attempts  involved  reordering  the  nodes  by  the 
red-black  order  (checker  board  order).  This  method  also  did  ix>t  work  wdl  for  our 
current  problem.  The  reason  for  this  is  that  the  iimer  most  loop  has  computations  ^ch 
are  too  complicated  to  efifectively  be  done  on  a  vector  pipeline. 

The  multiprocessing  approach  with  domain  decomposition  reordering  (White 
1987,  or  Ortega  1988)  was  much  more  promiring.  This  reordering  is  depicted  in  Figure  5 
where  L  ^  4  (the  number  of  larger  blocks  of  nodes)  and  the  classical  order  of  the  blocks  of 
grid  points  is 

P  P  P  P  P  P  P 

The  domain  decomposition  order  lists  all  the  smaller  interior  boimdaiy  blocks  first  (evoi 
number  blocks  in  Figure  6)  and  is 
P  P  P  P  P  P  P 
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The  idea  behind  this  reordering  is  to  take  advantage  of  the  5-point  finite  difference  pattern 
Once  the  calculations  in  the  even  blocks  have  been  done,  then  the  calculations  in  the  large 
odd  blocks  are  independent  of  one  another 


Figure  6:  Domain  Decomposition  Order 


for  k  =  l.maxit 

concurrently  do  SOR  over  the  even  blocks 
update 

concurrently  do  SOR  over  the  odd  blocks 
test  for  convergence 
end  loop  k. 

In  our  calculations  we  used  the  Kendall  Square  Research  multiprocessing 
computer,  KSRl,  which  is  operated  by  the  North  Carolina  Supercomputing  Center.  The 
KSRl  multiprocessing  computer  has  three  parallel  constructs  that  can  be  used  in 
FORTRAN  code:  tile,  parallel  section  and  parallel  region.  Tile  is  used  to  partition  loops 
and  is  very  effective  for  simple  computations  such  as  matrix  multiplications.  Parallel 
section  can  be  used  to  concurrently  execute  difierent  code  segments.  We  used  parallel 
region  which  duplicates  a  code  segment  and  uses  different  data  streams.  In  our 
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compulations  we  controlled  the  number  of  processors  by  using  a  leant  of  processors  that 
are  assigned  at  the  beginning  of  the  code  and  are  used  to  reduce  parallel  overhead 

Table  2  shows  the  speedup  and  efficiency  for  a  variety  of  L  (the  number  of  large 
blocks)  and  N  (the  number  of  cells  in  each  direction)  These  quantities  are  defined  as 
follows. 

Sl  =  (CPU  time  using  one  block)/(CPU  time  using  L  large  blocks)  and 
El  =  Sl/L. 

In  the  first  four  rows  N  varies  and  L  is  fixed.  We  see  increased  speedup  and  efficiency  as 
N  increases.  This  is  a  resuh  of  decreased  parallel  overhead.  In  the  last  four  rows  N  is 
fixed,  and  L  is  increased.  Here  the  speedup  increases,  but  the  efficient^  decreases.  Of 
course,  if  there  are  many  larger  blocks  (L)  and  the  number  of  cells  in  each  direction  (N) 
remains  the  same,  then  the  rdalive  size  of  the  larger  block  to  the  smaller  block  decreases. 
This  partially  accounts  for  the  decreased  efficiency.  These  calculations  did  not  attempt  to 
make  the  most  efficient  use  of  the  FORTRAN  language,  or  the  most  efficient  use  of  the 
KSRl  computer's  architecture. 

Table  2:  Speedup  and  Efficiency 


N 

L 

Sl 

Ei. 

20 

2 

1.56 

0.78 

40 

2 

1.68 

0.84 

80 

2 

1.75 

0.88 

160 

2 

1.78 

0.89 

160 

4 

3.16 

0.79 

160 

8 

5.09 

0.64 

160 

16 

7.29 

0.46 
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7.  CONCLUSIONS 

Richards'  equation  was  approximated  by  the  finite  difference  method,  and  the 
empirical  data  for  the  moisture  content  and  the  hydraulic  conductivity  were  approximated 
by  piecewise  linear  functions.  The  resulting  nonlinear  algebraic  system  was  solved  by  a 
variation  of  the  nonlinear  SOR  iterative  method.  Good  convergence  properties  were 
observed  for  three  types  of  calculations  which  were  chosen  to  demonstrate  the  feasibility 
of  realistic  numerical  simulations  using  this  method.  These  included  an  accurate 
simulation  of  fluid  flow  in  Brindabdla  loam,  a  senativity  analysis  of  the  computed  solution 
upon  the  empirical  data,  and  the  use  of  muitiiMX>cessing  computers  via  domain 
decomposition  methods. 

In  the  above  calculations  the  empirical  data  did  not  have  a  space  dependence. 
However,  in  White  et  al.  (1993)  we  illustrated  for  a  steady  state  and  one  space  dimension 
version  of  Richards'  equation  that  the  compact  volume  method,  in  place  of  the  finite 
difference  method,  could  be  effectively  used  for  such  heterogeneous  problems.  The 
compact  volume  method  can  be  viewed  as  an  enhanced  finite  difference  method  where 
additional  nodes  are  inserted  at  the  cell  interface  and  additional  equations  are  generate  by 
requiring  continuity  of  the  fluid  velocity  at  these  interfiles.  This  may  be  done  for  all  cell 
interfaces  or  for  just  those  cells  where  the  empirical  functions  change  with  respect  to  the 
space  variable.  We  expect  the  methods  of  this  paper  to  generalize  via  the  compact  volume 
method  to  the  more  complicated  heterogeneous  case. 
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I  Abstract 

We  consider  balk  the  compact finite  volume  and  finite  difference  space  discretisations  of  the  St^an  problem. 
The  resulting  algebraic  systems  are  solved  by  nonlinear  versions  of  ADI  and  SOR.  Both  algorithms  contain 
i  significant  parallelism  which  is  demonstrated  on  two  vectorlmuliiprocessing  computers,  the  Alliani  FXI40  and 
the  Cray  Y-MP.  Numerical  experiments  indicate  that  the  compact  discretisation  and  ADI  give  the  best  accuracy 
I  vdth  the  minimum  computational  cost. 

I 

Introduction 

IThis  report  outlines  some  new  numerical  methods  for  the  Stefan  Number  ■  1 » (u  -  u^ 


■This  report  outlines  some  new  numerical  methods  for  the 
solution  of  multiple  space  dimension  Stefan  problem.  This 
will  include  both  SOR  and  ADI  algorithms  for  the  nonlinear 

I  algebraic  systons  which  result  from  both  the  FDM  and 
compact  space  discretizations  methos.  Analysis  of 
convergence  will  be  discused. 

I  The  ADI  algorithms  have  been  introduced  because  they 
appear,  in  many  examples,  to  be  more  effective  than  some  of 
the  traditional  algorithms  related  to  the  Stefan  problem. 
^Moreover,  these  ADI  algorithms,  as  in  the  linear  parabt^ic 
■problems,  have  large  number  of  independent  Iridiagonal 
■solvers.  Therefore  they  can  be  effectively  implemented  on 
vector/multiprocessing  computers.  This  is  illustrated  for 
■  Alliant  FX/40  and  the  Cray  Y-MP. 

^In  this  paper,  we  will  use  the  enthalpy  fonnulation  of  the 
Stefan  problem.  (See  Rose)^  Elliot^  ^  Whitc^. 

|l.  E,-Ap(£)  =  /on  Dx(0,7) 

E(x,0)  =  given  on  D 

■  P(£(jc,r))=  given  on  dDx{0,T) 

*where 

IP(£)=Kirchoff  transformed  temperature 

P(£) = u  is  the  “inverse”  of  the  enthalpy  function 
EsH(u). 

jjmiypical  values  are  (See  Williams  and  Wilson^) 

u=270 

r 

■i  » I.OE+2 


a^ s  I.(^-t-3  (Degrees  in  Kelvin) 


rTMi  nsearch  was  sponsorad  by  tha  Air  Forca  OtBea  of  SdaniSc 
^Rasaareh,  Bating  Ak  Forca  Baaa,  D  C  Comraet  No.  F49620-B9  C- 
OOtO  and  Army  Rasaareh  ONee,  Rasaareh  Triangia  Park,  N  C  Con- 

tCtNo.DAAL03-90-Q-0126. 

NCSMalMvmsily.RaMgh.NC 
IKAtT State  Univaraily,  Qraanaboro,  NC 


Fbr  these  typical  values  and  the  range  of  u,  it  is  important  to 
note  that 

P(£)/E  is  Lipschitz  continuous  and 
(P(£))/££  M  where 

m«p  (27000))/27000  -  270/27000=  l.OE-2 

M»p  (27030))/27030  -  285/27030=  1.054E-2 

The  difference  M-m  =  5.4E-4  is  relatively  small  and  will  be 
important  in  a  mild  constraint  on  the  time  interval  >  A  (. 

Problem  (I)  may  be  discretized  in  the  time  variable  either 
explicitly  (see  Athcy')  or  implicitly  (see  White^).  In  the 
former,  this  is  a  stability  condition  on  A  r,  but  there  is  not  a 
nonlinear  syuem  to  solve.  The  implicit  time  discretization 
yidds  no  stability  constraint  on  A  T,  but  it  does  give  a 
nonlinear  system  to  solve.  In  White^  and  Elliot  nonlinear 
SCMl  methods  have  been  used  to  approximate  this  system 
which  has  the  form 

1  E+Ai  A  P  (£)  =  </ 

Where  A  is  an  M-nmlrix  associated  with  -A. 

Later  in  Whited  a  modified  version  of  (2)  was  studied  so  that 
more  traditional  linear  solvers  could  be  used  in  the  solution 
(2).  The  component  from  of  (2)  is 

n 

7-  »■ 


£i+  Ar 


LetA(E)> 


[a:  i  and  consider  the 


and  consider  the  vector  form  of  (3) 


E  +  AM(£)£  =  d 
4.  /+  AtA{E))Em  d 
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Tot  A  t  siMiicwltjii  icsiriclcd  iiiul  llic  jUh)vc  lypiciit  values, 
I^AM(^)  will  be  slriclly  diagonally  doiiiinaiK,  aiul 
llicrcliHC,  nuiisiiigiilai.  I  bis  allows  us  lo  doliuc  llic  lollowiiig 
algoridiin: 

5-  (/  + A/ /»(£”))■'  f/ 

The  convcrgciKC  of  (5)  was  csiablislicd  by  sliowiiig  Tor 
suitably  small  At  Uiai  tlic  map  was 

G(E)  •  {l  +  Aiy^(Er))-'d 

coniracUve.  However,  numerical  experiments  on  a  one 
variable  Stefan  problem  showed  Ural  At  constraint  note  to  be 
too  severe. 

This  study  extends  die  solution  o(^'^  and  to  higher 
dimenskms  by  consUkriog  AOI  splitting  associated  wiUi 
A(E). 

ADI:  One  II  and  V  sweep  per  time  step 

Let  A  =  H  V,  wlierc  H  and  V  arc  associated  witli  Itorixoiital 
grid  rows  and  the  vertical  grid  columns  of  -  A.  Tlicn  we  define 

77(E)  ■//(£)  = 

V(£)  -  V(E)  = 

0(E)  =  ^  +  ArV(£) 

Note  I  -I-  A/  A(E)  =  fi{E)+^{E),  and  for  mikl  constraint  on 
At ,  /7(E)  and^(E)  arc  nonsingular  and  triadiagonal. 


8.1 

(a/  >  /)(C*^  =  (a/  - 

R.2 

(a/+  (>(E** '))£**'=  (o/-  />(E**'^2))E**'^' 


Both  (8.1)  ami  (8.2)  can  be  solved  as  in  (5).  For  example, 
consider  (R.l)  with  k  +  1/2  suppressed  and  m  equals  the 
iteration  index. 


Coiiskicr 
9  (0/+  fi(E)d 


/>(E-')j  d 

Proposition  1: 

Let  A,  A(E).  h  (E),  ^  (E)  be  as  above.  If  A  is  an  M-matrix 
and  |)(£)  is  as  dcHncd  above,  tlicn  A(E).  /)  (E),  ^  (E)  are 
non-singular  for  small  A/.  Moreover,  tlicrc  exists  >  0  such 
tliat  for  a  ^  converges  to  tlie  unique  solution  of 


10  (a/ +  />(£*•)) 


The  proof  follows  tliat  given  in  Whitc^*^  for^^^  and 


In  practice  a  is  u.sed  as  an  acceleration  parameter  and  usually 
only  3-5  iterations  arc  required  lo  solve  (8.1)  or  (8.2).  The 
sclicmcs  in  (8. 1 )  and  (8.2)  do  not  solve  die  implicit  Ume  step: 


11.  +  A(E*'^*)E**‘ 


yt+l 


For  each  fixed  problem  in^'*^  there  arc  a  mimbcr  of  ADI 
schemes  with  mulUple  H  and  V  sweeps.  We  focus  on  only 
one  in  the  next  section. 


This  suggests  die  following  nonlinear  verion  of  the  Peaceman 
-  Ranchford  ADI  algorithm  for^'^ 

6.1 

^^A^2  ^ *  77(£*^''^)E**‘'^=/*''^-  V(E*) 

6.2 

^♦l_  !/(£*+')£**'  =  /+'-  77 (E**''^) 

At/2 

At  each  Ume  step,  k,  (6. 1 )  and  (6.2)  gives  a  nonlinear  system 
to  solve: 

7.1  /7(E*^'''^)E**'''^=  d***''^-  ^(E*)E* 

7.2  ^(£*+')E*+'=  />(E*+'''^)E**''^ 

It  is  useful  to  incorporate  die  ADI  acceleration  parameter  a 
which  will  also  alow  as  to  avoid  any  fuitlicr  constraint  on  At: 

A(E)=  (a/+  />(E))-  (a/-  <>(£)) 

Then  (7.1)  and  (7.2)  give 


ADI:  Mulliple  II  and  V  sweeps  per  time  stop 

Consider  die  simple  nonlincrar  syslem^^^  with 

A(E)E=  77(E)+  V'(E),or 
12.  (/>(£)+  (7(E)) E=  d 

After  linearixation  the  multiple  II  and  V  sweep  ADI 
algorithm  is 

13.1 

(a /  +  /> (£"«)) d+  (a/-  ^(£^))£^ 

13.2 

(o/+0(£^))£”^*=  d+  (a/-  /)(E^))£"^‘^ 

One  can  solve  (13.1)  and  insert  it  into  (13.2)  to  obtain 

14.  E”'^'=  8(£^)+ //(E^)£" 

If  E"*'  converges,  define  £”•  *  ' 

provided  die  siiectral  radius  of  11  (£” •)  is  less  dian  1 .  Hits  is 

15.  E^^  ‘  =  (/  -  //  (E^))~'  5  (£^) 
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me  (15)  forms  llic  oulcr  itcruiion  of  our  uigoriiliin  lo  solve 

(2). 

rniprtsilion  2 

Consider  (12)  and  the  algorithms  (13)  -  (15).  Let  the  Siiine 
assumptions  hold  as  in  proposition  I.  Ttien  tJierc  exists  a 
o„>  0  such  tliat  if  a  S  a^,tlie  inner  iterations  (14)  converge 

to  E"*"  *  '  in  (15),  and  E"^*  *  converges  to  the  solution  in 

(12). 

The  proof  requires  the  spectral  radius  of  11(E)  to  be  unifonnly 
bounded  below  1.0.  Tliis  step  makes  use  of  nionotonicity 
pt..pertics  (nonnegativc  matrices),  and  it  contrasts  wiUi  the 
linear  case  where  symmetric  positive  dcnnilc  matrices  are 
used. 

The  above  schemes  is  not  the  only  way  to  do  multiple  H  and 
V  sweeps,  but  it  does  allow  one  to  give  a  convergence 
analysis.  The  best  version  of  multiple  H  and  V  sweeps  is  not 
yet  clear.  Hie  numerical  experiments  diat  we  have  done 
indicate  that  llie  Pcaceman-Ranchford  method  (one  H  and  V 
sweep  per  time  step)  is  adequate. 

Numerical  Experiments 
In  this  section  wc  consider  the  numerical  solution  of 
E,-  A3(E)=  0 

where  die  domain  D  =  (0,1)  x  (0, 1)  x  (0,l/>/2) 
and 

[-1.  t-^(x  +  y)<0 

The  space  variables  are  discretized  in  two  ways,  tlic  compact 
finite  volume  (CPT)  (See  Rose^),  and  the  traditional 
five-point  Finite  difference  mcdiod  (FDM).  The  resulting 
algebraic  problems  sire  solved  by  ADI  (one  II  and  V  sweep 
per  time  step)  and  by  SOR.  Tlic  computations  were  done  on 
two  vcctor/muUiprocessing  computers:  the  Alliant  FX/40 
and  die  Cray  Y-MP.  Tlie  Alliant  FX/40  lus  two  processors 
each  with  a  vector  pipeline.  Tlie  Cray  Y-MP  was  used  as  a 
single  processor  with  a  more  soFisticaicd  vector  pipeline  and 
muh  shorter  clock  cycle  time  (4  nscc  as  compared  to  170 
nsec). 

The  CPT  space  discretization  has  a  great  deal  of  parallelism 
which  can  be  executed  on  either  vector  pipelines  or 
multiprocessing  architectures.  Tlie  unknowns  at  the  center  of 
the  cells  maybe  grouped  together,  and  the  unknowns  at  die 
boundary  of  the  cells  grouped  together.  Tlie  resulting 
coefficient  matrix  has  die  form  of  a  two  coloring  scheme. 
Hence  one  can  execute  the  SOR  scheme  for  the  CPT 
discretization  on  vector  pipelines.  Also  the  CPT  space 
discretization  when  solved  by  ADI  methods  will  have  a  large 
number  of  independent  tridiagonal  solvers.  Thus,  the  vector 
version  of  die  iridiagonal  algorithm  may  be  used. 


In  tlic  case  of  FDM  space  discrciizaiion,  the  uaditionul  red- 
black  ordering  of  nodes  allows  vcclorizaiion  of  die  SOR 
algorithiii.  Also,  domain-decomposition  mcdiods  have  been 
used  on  SOR  so  diat  the  two  processors  of  the  Alliant  FX/4U 
can  be  effectively  used.  Of  course,  wc  can  use  the  vector 
version  of  tlic  tridiagonal  algorithm  to  solve  tlic  FDM  wlicn 
ADI  is  used. 

Tabic  I  CPT-ADI 


Cells 

162 

322 

642 

Computers 

Alliant,  serial 

2.01 

15.64 

122.28 

Alliant,  vector 

1.02 

5.98 

38.63 

Alliant,  vector,  2  processors 

0.57 

3.32 

21.07 

Cray,  serial 

0.129 

0.89 

6.52 

Cray,  vector 

0.062 

0.32 

1.76 

Table  2  CPT-SOR 

Cells 

162 

322 

642 

Computers 

Alliant,  serial 

4.68 

74.44 

1155.00 

Alliant,  vector 

2.67 

30.72 

508.81 

Alliant.  vector,  2  processors 

1.41 

15.99 

270.91 

Cray,  serial 

0.41 

6.23 

96.24 

Cray,  vector 

0.11 

1.06 

11.34 

Table  I  contains  computing  time  done  for  die  compact 
discretization  and  using  one  sweep  ADI  algorithm.  Both 
computers  were  effective  vcctorizcrs.  Table  2  contains 
computing  dmes  done  for  die  compact  discretization  with 
red-black  ordering  SOR  algorithm  where  ci>=1.2 


Table3FDM-SOR 


Cells 

162 

322 

642 

Computers 

Alliant,  serial 

0.82 

12.93 

186.72 

Alliant,  vector 

0.89 

10.29 

124.97 

Alliant,  vector,  2  processors 

0.51 

5.58 

67.35 

Cray,  serial 

0.069 

1.05 

,  15.48 

Cray,  vector 

0.043 

0.42 

4.19 
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The  iJiiid  table  contains  computing  times  for  Uic  FDM  witii 
red-  black  ordering  for  SOR  where  (u=1.2.  This  method 
requires  more  iterations  to  reach  convergence  tlian  docs  llic 
CPT-  SOR  mctliod. 

All  three  methods  give  good  accuracy.  Tlic  CPT-ADI  seems 
to  be  more  accurate  and  less  computing  lime  required  for  this 
and  related  examples. 
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Typical  values  ate  (See  Williams  and  Wilson  PJ) 

Uf-270 
a,  -  1.0E-b2 

ag  *  1  .OE-f  3  (Degrees  in  Kelvin) 

L-15 

Stefan  Number  ••  1 « (u  •  Uf)/L 

For  these  typical  values  and  the  range  of  u,  it  is  important  to  note  that 
(P^)/E  is  Upschitz  continuous  and 
0  ^  m  £  (P(E))/E  £  M  where 
m  -  O(27000)y27000  -  270/27000  -  l.OE-2 
M  -  (p(27030))/27030  -  285/27030  *  1.054E-2 
The  difference  M  -  m  ■  5.4E-4  is  reladvely  small  and  will  be  important  in  a  mild  constraint 

on  the  time  interval  >  At. 
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Abstract 

Recently  developed  Compact  Finite  Difference  scheme 
(CPT)  is  applied  to  two  dimensional  diffusion  equations. 
The  relative  merits  cf  CPT -ADI  are  investigaud  with  other 
computational  schemes  such  as  finite  difference  method  - 
ADI(FDM-ADIlandFDM-SOR.  The  numerical  results  oth 
tainedfrom  these  three  approaches  are  compared  to  known 
analytical  solutions.  The  primary  interest  <4  this  study  lies 
on  vectormttion  and  parallel  processing.  According  to  our 
results  shown  in  tables  1  CPT-ADI  is  found  to  be  superior 
scheme  with  regards  to  accuracy,  than  both  FDM-ADI  and 
FDM-SOR.  It  is  also  fastest  algorithm  than  both  FDM-ADI 
and  FDM-SOR  as  it  is  evident  from  CPU  times. 


with  center  points  Xj.  jsl/23/2,...,  M  -  1/2  teid  interior 
endpoints  Xj.  M  -1.  We  shall  adopt  die  finite 

difference  ntnations  ^  ^  ^  .f  1/2  *  ^  -i/2‘  h]  *  and 
tt(i)aU|^  We  also  denote: 

Ig  « { 1/2, 3/2.....  M  •  1/2)  (center  points) 
s  { 1.2,...,  M  •  1 )  (interior  enc^ints) 


1.  Introduction 

We  shall  describe  briefly  the  ctnqiact  finite  difference 
method  (CPT)  for  one  dimmensional  steady  state  problem. 
The  extension  to  2-0  problem  may  be  easily  d^.  The 
underlying  {diysics  behind  this  iqtproadi  lies  in  solving  the 
differential  equation  in  isolation  from  its  neighboring 
subintervals  (i.e.  compactly).  Then,  extend  the  solution  in 
the  large  by  means  of  continuity  conditions  for  tbe  flux  and 
tenqterature  accross  tbe  boundaries  of  the  contiguous 
subintervals  (See  Rose  [1]). 

We  consider  here  one  dimensional  steady  diffusion 


problem: 

(1) 

D.E. 

u'  =  f 

(2) 

r=u 

(3) 

B.C. 

forXeIandIs[I_,I>t-] 


Divide  tbe  interval  I  into  m  nonoverlaping  subintervals: 

*  This  research  was  sponsored  by  the  Air  Force  Office  of 
Scieatiiic  Research,  Bolling  Air  Force  Base,  D.  C.  Contract  No. 
F49620-89-C-0010  and  Army  Research  Office,  Research  Triangle 
Fluk.  N.  C.  Contract  No.  DAAL03-90-Ghl26. 


N - H 

I - • - 1 - • - 1 

♦j.i 

Fig.  1. 

The  discrete  equations  for  (1)  and  (2)  can  be  written  as.  for 


jelc. 

(4) 

Uj.^i-Uj.i  =  Axjfj 

2  2 

(5a) 

2  2 

and 

(5b) 

2  ^  2 

Nett,  it  is  required  that  both  u  and  f  be  continuous  across 
every  endpoint  common  to  two  intervals: 

Mi  =[4=0  for  i€  le. 


0-aia6>266S4/92  $03.00  C 1992  IEEE 
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Using  ihis  conunuity  condiiion.  |u]i  =  0  in  terms  of  (fi  we 
have 

0i  ■  ^*1  -  -L  *1*1  +  —  ■ 

_ 2  -  2 

h,.l  '  h.^i 
2  2 


and  in  the  case  of  equal  intervals,  which  reduces  to 


<>i+i  +  <>i-l 

«t>i  =  — - 2.,  i  €  le  (endpoints) 

£0 


Now  from  (4)  and  (3)  we  get, 

aa)  <>j+l-2^j  +  ^j.i  =  2hjfj,  i€  Ic 

2  2 

and  from  (6) 

(7b)  ^+1-2^)  + ^.i  =  0,  iele 

2  2 


The  equations  (7)  and  the  boundary  conditions  lead  to 
determined  system  of  algebraic  equations  for  the  values  of 
These  equations  lead  to  a  iridiagonal  system  of  equations 
which  is.  therefore,  solved  by  Thomas  algorithm. 

The  extension  to  two  dimensional  scheme  may  be  easily 
done.  Two  dimensional  stencil  is  shown  in  Fig.  2. 


n  ♦  ^ 

U  2  -  u" 
>■] 


U  I 

— =F.  2  +  Gr, 


1  n  + 


where  and  Gy  are  standard  finite  differeiKx  expressions 

for  u^  and  Uyy  respectively  and  t  s  At/2.  Using  the 

standard  fmite  difference  scheme  we  get  the  following 
equations 

/Q\  n  +  '^  n+i  a'4"^ 

(9)  -u.  2.  +  (a  +  2)  u  2  -  u  2  = 

>-i.j  I.J  i+r.j 


u!:j-i+(a-2)uPj-uPj,,, 

2 

i.j  =  1.3 . 2M-1,  a  =  ii^.  Ax  =  Ay 

At 

(10)  -uj  t  \  +  (a  +  2)  u{|  J-  ^  ,  = 

u.  2.  +(a-2)u .  .  2  -  u.  2. 

i-i.j  i,j  i+itj 

i  J  =  13>  ••••  2  M  - 1 

and  the  continuity  conditions  are 


/Oat  B  +  1  _n  +  J-  n+J-  „ 

-U  2  +  2  U  2  -  U  2  =  0 
i-l,j  i,j  1+1. j 

i,j  =  2, 4, 2M-2 

(loa)  ■ui:|.\+2ui;|»-u[;t;,=o 

i0*2,4 . 2M.2 

The  two  algebraic  systems  (9).  (9a),  and  (10),  (10a)  can 
be  solved  by  the  tridiagonal  algorithm. 


2.  Numerical  experiments 


Fig.  2. 

Consider  the  two  dimensional  diffusion  equation  u, «  u„ 
•f  Uyy.  The  compact  ADI  method  is  given  by  equations  (8) 
•(10): 


One  of  the  primary  objectives  of  this  projea  is  to 
vectorization  of  1-D  and  2-D  diffusion  problems.  We 
vectorize  the  2-D  heat  equation  for  the  Cray  Y-MP  using 
red-bltudc  SOR  algorithm. 

We  consider  one  analytical  solutions  for  the  following 
diffusion  equation  and  conqnre  them  with  reflective 
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numencal  solutions.  The  resulLs  are  recorded  in  tables  1 . 

The  two  dimensional  diffusion  equation  is  Uj  =  Uj^j^  +  u^y 

+  f(x,y,t).  0  <  x<  1.0  <y  <1.  t  >  0.  The  following  analylical 
solutions  are  considered: 

u(x.y.t)=  lOOxd  -x)y(l-y)e*^andf(x,y,t)  =  200(y-y^  + 
x-x^)  -  100(x-x^)(y-y^e'* 

All  the  problems  mentioned  above  are  associated  with 
Dirichlet  boundary  conditions. 

The  OPT-ADI  is  compared  with  FDM-ADI  and 
FDM-SOR.  The  functional  errors  in  the  method  is  0(h2)  as 
expected  and  derivative  error  is  of  0(h).  The  result  is  shown 
in  table  1. 


FDM-ADI  At  =  Ax2 


No.  of 
Hxahy  cells 

No.  of 
iters 

CPU  Time 
(sec) 

M«Fct 

Error 

Max  Derivative 
Error 

16 

128 

0.76 

4A924E-03 

1.01704E-02 

32 

512 

6.15 

1.1469E-03 

5J24588E-02 

64 

2048 

49.48 

2.8667E-04 

2.66432E-02 

3.  Concluding  remarks 

The  X.  y  domains  are  devided  into  8.  16.  32  and  64  cells 
and  in  each  case,  the  starting  time  is  taken  to  be  0  and  the 
fmal  time  is  1.  The  relation  between  time  step  and  space 
step  varies  among  FDM-ADI.  FDM-SOR  and  CPT-ADI 
schemes.  We  assume  all  the  material  constants  are  equal  to 
unity  and  Ax  =  Ay.  In  case  of  FDM-SOR  and  CPT-ADI.  Ax 

*  Ay  =  At  and  however  in  case  of  FDM-ADI.  At  =  (Ax)^. 
This  is  the  disadvanuge  for  FDM-ADI.  For  example,  when 
Ax  =  1.S62SE-2,  At  =  4.883E-4  and  it  would  take  2048 
iterations  to  reach  time  1.  However,  CPT-ADI  and 
FDM-SOR  are  free  from  this  difficulty.  But  SCKl  also  has  a 
difierent  disadvantage.  Each  time  step,  FDM-SOR  takes 
large  number  of  iterations  to  converge  to  a  [seassigned 
level.  With  64  cells  CPT-ADI  requires  only  64  iterations  to 
reduce  the  error  level  to  334792E-03,  where  as  FDM-ADI 
requires  2048  iterations  to  reduce  the  error  level  to 
2.8667  lE-04  and  FDM-SOR  requires  3,392  iterations  to 
reduce  the  error  level  to  9.57374E-04. 

As  CPT  algorithm  requites  mote  points  evaluation  per 
iteration,  it  is  obvious  that  CPT  requires  more  CPU  time 
than  FDM-SOR  or  FDM-ADI. 

Acknowledgments:  Thanks  to  M.  E.  Rose  for  his  many 
valuable  suggestions  for  the  completion  of  this  ptqrer.  Also 
many  thanks  go  to  A.  Nachman  of  AFOSR. 


CPT-ADI  At  =  Ax 


No.  of 
Hx=hy  cells 

No.  of 
iters 

CPU  Time 
(sec) 

Max  Fct 
Error 

Max  Derivative 
Error 

16 

16 

1.183 

5J563E-03 

5.33550E-03 

32 

32 

9.26 

1J391E-03 

U390E-03 

64 

64 

73.23 

33479E-03 

33477E-03 
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FDM-SOR  At  =  Ax 


No.  of 
Hx=hy  cells 

No.  of 
iters 

CPU  Time 
(sec) 

Mm  Fct 
Error 

Mm  Derivative 
Error 

16 

240 
c»=  1.65 

2.10 

3.5050E-03 

5.869E-01 

32 

960 

o>=1.80 

36.73 

2.155E-03 

2.940E-01 

64 

3392 

u>=  1.85 

473.18 

9A737E-04 

1.469E-01 
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Abstract 

A  recently  developed  Compact  Finite 
Difference  Scheme  is  applied  to  3-0 
diffusion  equations.  The  relative  merits  of 
CRT  -  ADI  is  compared  with  two  other 
computational  schemes  such  as  Finite 
Diff^nce  SOR  and  Fnite  Difference  ADI. 
The  numerics^  results  obtained  from  these 
diree  sdiemes  are  compared  to  known 
analytical  solutions.  The  primary  interest 
of  diis  paper  lies  in  vectorization  and 
parallel  processing.  CPT-ADI  is  the  fittest 
algoridim  than  both  FDM-ADI  and  FDM 
-SOR  as  is  evident  from  the  CPU  times. 


for  X  e  1  and  I » [I ..  U] . 

Uvide  die  interval  I  into  m  nonoverlapping 
subintervals; 

Ij=  {x/Xj.i/2  ^X^Xj+1/2} 
with  center  points 

and  interior  endpoints 

We  shall  adopt  finite  difference  notations 

and  u(i)  =  ui .  We  also  denote  : 


1.  Introduction 

We  shall  briefly  describe  the  Compaa 
Finite  difference  scheme  for  a  l-D  steady 
state  problem.  The  extensitm  to  a  3-D  case 
maybe  easily  done.  The  idea  behind 
this  approach  is  to  solve  the  differential 
equation  in  isolations  from  its  nei^boring 
subinteivals  (i.e.  compactly)  and  then 
extend  the  solution  in  the  laige  by  means  of 
the  continuity  cmiditions  for  the  flux  and 
the  temperature  across  the  boundaries  of 
the  contiguous  subintervals  (See  Rose[l]). 

We  consider  here  l-D  Steady  diffusion 
problem : 


Ic  =  (Center  Po  ints) 

Ig  s  {1Z3.  — ,  M-1}  (inte  rio  r  endpoints) 

— tj*  ^ 

I - • - 1 - • - 1 

*4  ♦j  ♦j-l 

Fig.  1 

The  discrete  equatimis  for  (1)  and  (2)  can  be 
written  as,  for  jelc. 


(1) 

D.E. 

u'  =  f 

(4) 

(2) 

d>'«u 

(3) 

B.C. 

d>»g 

(5  a) 
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and 

(5  b)  <Pj-(P34  =hjUj.| 

Next .  it  is  required  that  both  u  and  f  be 
cominuous  across  every  endpoint  common  to 
two  intervals : 

[u]i  =  [<P]i  =  0  forielg 

Using  this  continuity  condition  ,  [uJi  =  0.  in 
terms  of  <p  we  get 

•’i-i  *’‘4 

and  in  case  of  equal  intervals,  this  reduces  to 
q)i4  +  <Pi_l 

(6)  9i  - - ielg  (endpoints) 

Now,  from  (4)  and  (S),  we  get 

(7a)  <Pj  +  i  -  2<Pj +<Pj4  =  2h]fj,  ieic 
and  from  (6)  we  have 

(7b)  <Pj+j  -  2<pj +<Pj_^  *0,  iel^ 

The  equations  (7)  and  the  boundary 
conditions  leads  to  a  determined  system  of 
algebraic  systems  for  the  values  of  <p,  which 
can  be  solved  by  Thomas  Algorithm. 

The  extensioD  to  a  three  dimensional 
scheme  may  be  done  easily. 

Three  Dhnensional  stencil  is  shown  in 
Fig.  2. 


Consider  the  3-D  diffusion  equation  u,  =  u„ . 
Uy>  ^zz-  The  Compact  ADI  method  is  given 
by  equations  (9)  -  (12)  ; 


(8a) 

(8b) 


u!*4.-u"^.  1  1 


U?”i .  -  u!*^ .  3  1 

(8c)  +  2H”7^ 

^  l*J*K  l«J«K 

Uf+J  .  3 

(8d)  a  +  2ItiU 

where  Fi.j.k.  Q.j.k  and  Hi.j.k  are  finite 
difierence  approximation  to  u„  .Uyy  and  u^j 

reflectively  and  ^  •  Using  the  standard 
finite  difference  scheme,  we  get  the  folllowing 
equations  : 

=  2XxU*;y  |;^+(l-4X^)U®j  it+  j  k 

ijjcsl3,  ....3M-1,  Xx  -  AyS  4.0,  AxssAysAz. 
(10) 

iJJcsl3 . 3M-1,  Xz  =  Xy=  4.0,  AxsAysAz 

(11) 

“Jit-  ^  “Jit-i 

ijjcsl,3 . 3M-1,  Xz  *  Xjr=  4.0,  AxasAy=Az 


Fig.  2. 
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I  i:i  1 1-  4^,)  ufjii,  -  2?.^  uf*,!, ,  t 

>  i,j+i,k  y  -  i.j-i.k. 

i.jjc=1.3 . 2M-1.  =  Xy  =  4.0.  Ax=Ay=Az 

and  the  continuity  conditions  are 

(12.)  -u«S^j  .k+2u»>i-u«,V.k  -0 

ij.fes2,4 . 31  X,  *  Xy=  4.0,  Ax=Ay=Az 

The  four  algebraic  systons  <9),(9a).(10).(10a). 
(ll).(lla)  and  (12)  &  (12a)  can  be  solved  by 
die  tridiagooal  algorithm. 

The  Finite  Difference  ADI  euations  for  3-D 
diffiisimi  equation  are  given  below: 

(13)  -u5y-H4 

(14)  -u^ = 

u{*^*  k  “  2  2 

(15)  + 

Here  G  =  . 

Using  the  standard  Finite  Difference  scheme 
we  get  tbe  following  equations  : 

(16)  =Xyii||.^H  +  X,u“j_j  ||  +  XX,j,k+i  + 

.k-1  23^11^ jj. 


+ o+2x^ui:i‘,k  - 

('«)  -^“r:t.j.k*>n>“r7i.j.k-^vrii,k 

-vrii.k^‘-2^-2^)“rfk 

Fmally .  the  Finite  Difference  SOR  scheme  for 
the  3-D  Diffusion  Equatitm  is  given  below  : 

(l-^6X)Utemp  *  u»  j ,  k  -t-  Atfff*  j ,  ^  + 

1  i  .k  +  1.  j  .k 

+  «f!j.i.k  +  »nU-i+«f!j.k.n) 

“f!jU  ®  (1"^)  “fl j  .k  +  foUiemp,  1  ^  0)  ^  2, 
Here  X  »  4.0  and  m  is  tbe  iteration  index. 


2.  Numerical  Experiments 

One  of  die  primary  objectives  of  diis  projea  is 
to  vecioiizaticD  of  the  3'D  problems.  We  use 
red-Uack  SOR  algoridim  to  vectorize  die  3-D 
heat  equation  for  the  Cray  Y-MP. 

We  consider  one  analytical  solutions  for  die 
foUowing  diffusion  equation  and  compare 
tfMwi  with  respective  numerical  solution.  The 
results  are  lecotded  in  Table  1. 

The  3-D  diffusion  equation  is  Ut 
•Hin+  f(x,y,z4).  0<x<l,0<y<1.0<z<l. 
t>0. 

Tbe  following  analytical  solutimis  is 
considered: 

u(x,y.z.t)  =  l(X)x(l-x)y(l-y)z(l-z)e-*  and 
f(x.yAt)*(200(^-y2)(z-22)+(y-y2)(x-x2)+ 
(z-z2)(x-x2))  -  100x(l-x)y(l-y)z(l-z))e-‘ 

All  die  problems  mentioned  above 
are  associated  with  Diikhlet  boundary 
conditions. 
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The  CPT-ADI  is  compared  with  FDM-ADI 
and  FDM-SOR .  The  Functional  eirors  in 
the  method  is  0(h2)  and  derivative  error 
is  0(h).  The  result  is  shown  in  Table  1. 

CPT-ADI 


PUTime 

MaxFct 

MaxDer 

(sec) 

Error 

Eiru' 

0.0042 

0.068S 

03828 

0.0912 

0.0341 

0.1326 

23406 

0.0046 

0.01S9 

MaxError 


«  of  Calls 


FDM-ADI 


iter 

CPU  Time 
(sec) 

MaxR:t 

Error 

MaxDer 

Error 

16 

0.012 

0.00977 

0.6179 

64 

0366 

0.00241 

03974 

256 

10.72 

0.00006 

0.1461 

The  curve  (1)  is  for  the  OT-ADI 
scheme,  curve  is  for  the  FDM-ADI 
scheme  while  curve(3)  is  for  the  FDM 
-SOR.  schenae. 

MaxOerError 


FDM-SOR 


8  12  16  20  24 


*  of  Calls 


iter 

CPUTime  j 
(sec) 

MaxFCt 

Error 

MaxDer 

Error 

omegi 

36 

1.45 

0.0043 

0.00318 

0384 

72 

0.1008 

0.00103 

03909 

1.55 

304 

1.75 

3.158 

0.00025 

i 

0.1449 

curve  (1)  above  shows  the  gnq^of^ 
tnaxder  error  Vs.  the  number  of  cells  for 
the  CPT-ADI  schone.  Qirve  (2)  shows  the 
grqA  of  maxder  error  Vsjiuniba  of  cells 
for  the  FDM-SOR  scheme  and  curve(3) 
shows  the  graph  of  maxder  error  Vs.  Ae 
number  of  cells  for  the  FDM-ADI  scheme. 
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curve  (1)  is  for  tiic  CPT  -  ADI  scheme, 
curve  Q)  is  for  the  FDM-S(Xl  scheme  and 
curve  (3)  is  for  the  FDM-ADI  scheme. 

3.  Condudiiig  Remarks 


The  X,  y.  and  z  (knnain  are  divided  into  4. 
8 , 16  cells  and  in  each  case,  the  initial  time 
is  taken  to  be  0  and  the  final  time  is  1.  We 
assume  all  the  material  constants  to  be  equal 
to  unity  and  Ax  s  Ay  =  Az.  In  case  of  FDM* 
SOR  and  CPT-ADI.  Ax  =  Ay  s  Az  ^t  and 
however  in  case  of  FDM-ADI  .At  s  (Ax)2. 
This  is  the  disadvantage  of  FDM-ADL 
However.  CFT-ADI  and  FDM-SOR  are  free 
fiom  this  difficulty.  But  SOR  has  a  different 
disadvantage,  for  eadi  timestep.  FDM-SOR 
takes  laige  number  of  iterations  to  conveige 
to  a  prea^gned  level  16  cells .  CFT- 
ADI  requires  16  iterations  to  reduce  die 
error  level  to  4.59  E  -3  .  whereas  FDM-ADI 
requires  256  iterations  to  reduce  tbe  eiror 
level  to  6.0  E -4.  whereas  FDM-SOR 
requires  304  iterations  to  reduce  the  error 
level  to  2.4872  E-4. 

In  comparing  CPU  times  among  all  die 
three  apfroachK.  it  is  found  that  CPU  time 
for  FDM-ADI  is  the  worst  For  the 
problem  with  16  cells .  CPT-ADI  takes 
only 2.346  sec.  However.  FDM-ADI 
approach  takes  10.72  secs  for  tbe  same 

problem.  _ 

In  terms  of  derivative  enror.  CPT-ADI 
is  the  best,  it  is  dose  to  the  order  of  OQi?). 
The  error  analysis  for  the  three  ^iproaches 
are  shown  in  Figure  3. 
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